Abstract

We study a representation of an antimatroid by Horn rules, motivated by its recent
application to computer-aided educational systems. We associate any set $\mathcal{R}$ of Horn rules
with the unique maximal antimatroid $\mathcal{A}(\mathcal{R})$ that is contained in the union-closed family
$\mathcal{K}(\mathcal{R})$ naturally determined by $\mathcal{R}$. We address algorithmic and Boolean function theoretic
aspects of the association $\mathcal{R} \mapsto \mathcal{A}(\mathcal{R})$, where $\mathcal{R}$ is viewed as the input. We present
linear time algorithms to solve the membership problem and the inference problem for
$\mathcal{A}(\mathcal{R})$. We also provide efficient algorithms for generating all members and all implicates
of $\mathcal{A}(\mathcal{R})$. We show that this representation is essentially equivalent to the Korte-Lovász
representation of antimatroids by rooted sets. Based on the equivalence, we provide a
quadratic time algorithm to construct the uniquely-determined minimal representation.
These results have potential applications to computer-aided educational systems, where 
an antimatroid is used as a model of the space of possible knowledge states of learners, 
and is constructed by giving Horn queries to a human expert. 

Keywords: Antimatroids; Horn rules; Implicational systems; Learning spaces; 
Knowledge spaces; Educational systems 


1 Introduction 

An antimatroid is a family $\mathcal{K}$ of subsets of a finite set $Q$ satisfying the following conditions:

(Union-closedness) For members $X, Y$ of $\mathcal{K}$, the union $X \cup Y$ is also a member of $\mathcal{K}$.

(Accessibility) For every nonempty member $X$ of $\mathcal{K}$, there exists an element $x$ in $X$ such
that $X \setminus \{x\}$ is a member of $\mathcal{K}$.

(We do not impose the usual condition $Q \in \mathcal{K}$.) An antimatroid (or its dual, a convex geometry)
is an axiomatic abstraction of a finite point set in Euclidean space, and arises ubiquitously
in various areas of discrete mathematics and theoretical computer science. Examples arise
from graph search, tree shelling, posets, and so on [niEi. One of the major applications
of antimatroids is the analysis of greedily solvable structures in combinatorial optimization;
see [30].
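The two axioms can be checked directly on a small explicit family. The following is a minimal brute-force sketch, assuming a family is given as a Python set of frozensets; the function names and the two toy families are hypothetical and only serve as illustration.

```python
from itertools import combinations


def is_union_closed(family):
    """True if the union of any two members is again a member."""
    return all(x | y in family for x, y in combinations(family, 2))


def is_accessible(family):
    """True if every nonempty member X has an element x with X - {x} in the family."""
    return all(any(x - {e} in family for e in x) for x in family if x)


def is_antimatroid(family):
    """Check both axioms; the ground set itself is not required to be a member."""
    family = set(family)
    return is_union_closed(family) and is_accessible(family)


# Hypothetical toy families on the ground set {0, 1}.
chain = {frozenset(), frozenset({0}), frozenset({0, 1})}
gap = {frozenset(), frozenset({0, 1})}  # union-closed, but {0, 1} is not accessible
```

Here `chain` passes both checks, whereas `gap` is union-closed but fails accessibility, so it is a knowledge space in the sense discussed below but not an antimatroid.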

A remarkable application of antimatroids has been emerging from the design of computer-aided
education systems na [m |22]. In Knowledge Space Theory (KST), an antimatroid is
called a learning space, whereas a union-closed family is called a knowledge space. They are
used as mathematical models of the space of all possible knowledge states of learners, where
the ground set $Q$ is a set of questions and the knowledge state of a learner is associated with
the subset of questions that he/she answers correctly. KST-based educational systems are
already in practical use, e.g., ALEKS^1. In this application, the size of antimatroids can be
quite large. Thus efficient representation, construction, and implementation of antimatroids
are of great importance.

In the literature of KST, Dowling-Müller [16, 32] and Koppen and Doignon [28] independently
introduced a rule-based representation of union-closed families by certain binary
relations, called entailments. They established a Galois connection between union-closed
families and entailments. In fact, their result may be viewed as a sharpening of the representation
of families by Horn rules (or Horn formulas), a well-known concept in artificial intelligence and
Boolean function theory; see m Chapter 6]. By a Horn rule (or rule) we mean a pair $(A, q)$
of a set $A \subseteq Q$ and an element $q$ in $Q$. We say that a rule $(A, q)$ accepts a subset $X$ if $q \in X$
implies $A \cap X \neq \emptyset$. For a set $\mathcal{R}$ of rules, let $\mathcal{K}(\mathcal{R})$ denote the set of subsets accepted by all
rules in $\mathcal{R}$. A classical result IMEIj in formal logic says that a family $\mathcal{K}$ of subsets (including
$\emptyset$) is union-closed if and only if it is represented by a set $\mathcal{R}$ of Horn rules as $\mathcal{K} = \mathcal{K}(\mathcal{R})$; see [TTl
Theorem 6.6]. The theory of implicational systems |9l[T0l[3ll|36| provides a unified and systematic
approach to this classical result, its generalizations, and ramifications, obtained in different fields
of mathematical sciences (e.g., formal logic, lattice theory, formal concept analysis, relational
databases, and KST). In this theory, a Horn rule is called a unit implication, and a set $\mathcal{R}$ of
Horn rules with $\mathcal{K} = \mathcal{K}(\mathcal{R})$ is called a unit implicational basis of $\mathcal{K}$.
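As an illustration of the acceptance condition, the following minimal sketch enumerates $\mathcal{K}(\mathcal{R})$ by brute force over all subsets of a small ground set. The encoding of a rule as a pair `(frozenset, element)`, the function names, and the toy rule set are assumptions made for this example; the enumeration is exponential and is not one of the algorithms of this paper.

```python
from itertools import combinations


def accepts(rule, x):
    """A rule (A, q) accepts X when q in X implies that A meets X."""
    a, q = rule
    return q not in x or bool(a & x)


def powerset(ground):
    """All subsets of the ground set as frozensets."""
    items = sorted(ground)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)


def family_K(rules, ground):
    """All subsets of the ground set accepted by every rule (brute force)."""
    return {x for x in powerset(ground) if all(accepts(r, x) for r in rules)}


# Hypothetical toy input: question 1 needs question 0, and question 0 needs question 1.
ground = {0, 1, 2}
rules = [(frozenset({0}), 1), (frozenset({1}), 0)]
K = family_K(rules, ground)
```

For this toy input the accepted subsets are $\emptyset$, $\{2\}$, $\{0,1\}$, and $\{0,1,2\}$, which form a union-closed family.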

By definition, an antimatroid is a union-closed family. Thus an antimatroid $\mathcal{A}$ can be
realized in computers by maintaining some implicational basis $\mathcal{R}$ with $\mathcal{A} = \mathcal{K}(\mathcal{R})$. Various
operations on $\mathcal{K}(\mathcal{R})$ can be efficiently conducted by accessing $\mathcal{R}$. In practice, a large family
can often be represented by a small set of rules. Also in the literature of implicational systems,
several "useful" or "compact" implicational bases of antimatroids (or convex geometries)
and related closure systems have been investigated; see m s n issuMj. However, this way
of representing an antimatroid by $\mathcal{K}(\mathcal{R})$ has one obvious drawback: not every set of rules
corresponds to an antimatroid. Therefore special care is needed to keep $\mathcal{K}(\mathcal{R})$ an antimatroid
when $\mathcal{R}$ is frequently changed by additions and deletions. Such a situation naturally occurs in the
design of KST-based educational systems, and this drawback has been one of the main difficulties
for practical use of antimatroids.

In this paper, we overcome this drawback by another way of associating any set of rules 
with an antimatroid. Our starting point is the following: 

There exists a unique maximal antimatroid contained in any union-closed family. 

This fundamental fact was recently noticed by Doignon [14, p. 14] in KST, and is a direct
corollary of a classical result [18, Theorem 2.2] of Edelman that for two antimatroids $\mathcal{A}, \mathcal{A}'$ on
the same ground set, the family $\{X \cup X' \mid X \in \mathcal{A},\ X' \in \mathcal{A}'\}$ is again an antimatroid. We will
use the following explicit characterization of this maximal antimatroid. For finite sets $X$ and
$Y$ with $Y \subseteq X$, a tight path from $Y$ to $X$ is a sequence $Y = Y_0, Y_1, \ldots, Y_k = X$ of subsets of
$Q$ satisfying $Y_i \subseteq Y_{i+1}$ and $|Y_{i+1} \setminus Y_i| = 1$ for $i = 0, \ldots, k-1$. For a union-closed family $\mathcal{K}$ on
$Q$, let $\overline{\mathcal{K}}$ denote the family of subsets $X$ in $\mathcal{K}$ such that there is a tight path from $\emptyset$ to $X$ in
$\mathcal{K}$. Then it holds:

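Theorem 1.1. For a union-closed family $\mathcal{K}$, the family $\overline{\mathcal{K}}$ is the unique maximal antimatroid
contained in $\mathcal{K}$.

For a set $\mathcal{R}$ of rules, we define $\mathcal{A}(\mathcal{R})$ as the maximal antimatroid $\overline{\mathcal{K}(\mathcal{R})}$ in the union-closed
family $\mathcal{K}(\mathcal{R})$.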
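The tight-path characterization suggests a simple search: starting from the empty set, repeatedly add one element while staying inside the family. The following minimal sketch does this on an explicitly listed union-closed family; it is a brute-force illustration with hypothetical names, not the linear time procedure developed later.

```python
def maximal_antimatroid(family):
    """Members of a union-closed family reachable from the empty set by a tight path."""
    family = set(family)
    empty = frozenset()
    if empty not in family:
        return set()
    ground = frozenset().union(*family)
    reached = {empty}
    stack = [empty]
    while stack:
        s = stack.pop()
        for e in ground - s:
            t = s | {e}  # one step of a tight path: add exactly one element
            if t in family and t not in reached:
                reached.add(t)
                stack.append(t)
    return reached


# Hypothetical union-closed family on {0, 1, 2}: 0 and 1 can only be learned together.
K = {frozenset(), frozenset({2}), frozenset({0, 1}), frozenset({0, 1, 2})}
A = maximal_antimatroid(K)
```

In this toy family only $\emptyset$ and $\{2\}$ are reachable, since $\{0,1\}$ cannot be entered one element at a time.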

^1 http://www.aleks.com/.
The main subjects of this paper are fundamental algorithmic aspects of the association
$\mathcal{R} \mapsto \mathcal{A}(\mathcal{R})$, and their implications for educational system design. To begin with, let us
formalize our algorithmic setting. We are given a set $\mathcal{R}$ of rules as an input, where the size of
$\mathcal{R}$ is its coding length $l(\mathcal{R}) := \sum_{(A,q) \in \mathcal{R}} (|A| + 1)$. We address algorithms and computational
complexity for basic problems of handling $\mathcal{A}(\mathcal{R})$ by $\mathcal{R}$. Notice that this is different from the
standard setting in implicational systems, where an implicational basis of the family in question is given.
Indeed, $\mathcal{R}$ is not necessarily an implicational basis of $\mathcal{A}(\mathcal{R})$.

We first consider the membership problem for $\mathcal{A}(\mathcal{R})$:

Membership problem

Input: A set $\mathcal{R}$ of rules and a set $X$.

Task: Determine whether $X$ belongs to $\mathcal{A}(\mathcal{R})$.

Whereas the membership problem for $\mathcal{K}(\mathcal{R})$ is easily solved (in linear time), the computational
complexity of the membership problem for $\mathcal{A}(\mathcal{R})$ is not trivial. The problem is in NP, since a tight
path from $\emptyset$ to $X$ in $\mathcal{K}(\mathcal{R})$ is a polynomial certificate. If the membership problem can be
solved efficiently, then one might say that an antimatroid $\mathcal{A}$ can be realized in computers by
maintaining $\mathcal{R}$ with $\mathcal{A} = \mathcal{A}(\mathcal{R})$. We show that the membership problem for $\mathcal{A}(\mathcal{R})$ can also
be solved in linear time.

Theorem 1.2. The membership problem for $\mathcal{A}(\mathcal{R})$ can be solved in linear time.

Based on this linear time membership algorithm, we will provide an efficient algorithm to
enumerate all members of $\mathcal{A}(\mathcal{R})$.

We next consider the inference problem, which is motivated by query learning for
educational systems; see below. A rule $(A, q)$ is called an implicate of a family $\mathcal{K}$ if $(A, q)$
accepts all the members of $\mathcal{K}$.

Inference problem 

Input: A set $\mathcal{R}$ of rules and a rule $(A, q)$.

Task: Determine whether $(A, q)$ is an implicate of $\mathcal{A}(\mathcal{R})$.

We show that this problem can also be solved efficiently. 

Theorem 1.3. The inference problem for $\mathcal{A}(\mathcal{R})$ can be solved in linear time.

It turns out that this construction $\mathcal{R} \mapsto \mathcal{A}(\mathcal{R})$ of an antimatroid is essentially equivalent to
the construction of an antimatroid from rooted sets or circuits by Korte and Lovász [29]; see
m Section III. 3]. We will establish this equivalence. Korte and Lovász showed the existence
of the unique minimal representation of antimatroids by special circuits, called critical circuits.
Translating their result, for an antimatroid $\mathcal{A}$, there is a unique minimal set $\mathcal{R}^*$ of rules such
that $\mathcal{A}(\mathcal{R}^*) = \mathcal{A}$, where $\mathcal{R}^*$ is minimal in its cardinality as well as its size. A rule in $\mathcal{R}^*$ is said
to be critical for $\mathcal{A}$. We will provide a quadratic time algorithm to construct this minimal set
$\mathcal{R}^*$ from a given $\mathcal{R}$.

Theorem 1.4. For a given set $\mathcal{R}$ of rules, the set of critical rules for $\mathcal{A}(\mathcal{R})$ can be obtained
in quadratic time.

As an application, we can determine, in quadratic time, whether two sets of rules define 
the same antimatroid. 

The representation $\mathcal{R} \mapsto \mathcal{A}(\mathcal{R})$ fits naturally into the query learning of antimatroids arising
from the design of KST-based educational systems, which is the practical motivation of this
paper. Koppen (55] and others |5Hl [35] considered a procedure QUERY to build a space of
knowledge states by asking a series of queries $(A_1, q_1), (A_2, q_2), \ldots$ to a human expert. Here
a query $(A, q)$ is the question: does a learner fail to solve question $q$ provided he/she fails to
solve every question in $A$? Then, for the set $\mathcal{R}$ of 'yes' queries, the space of knowledge states
is estimated as $\mathcal{K}(\mathcal{R})$. The QUERY procedure is designed for the case where the space is
modeled as a union-closed family (a knowledge space). Therefore, the resulting $\mathcal{K}(\mathcal{R})$ is not
necessarily an antimatroid. In practical situations, however, educational systems need to use
an antimatroid (a learning space) as a model. This leads to the following question posed by
Doignon and Falmagne in [22, p. 335]:

This raises the following problem: assuming that, except for errors, the responses
to the queries are dictated by a latent learning space $\mathcal{L}$, can a learning space
approximating $\mathcal{L}$ be derived by the querying method through some elaboration of QUERY?


They developed a relatively complicated adaptation of QUERY, called adapted QUERY, that
always keeps the estimated space an antimatroid by careful management of 'yes' queries and the
surmise function; see [22, Chapter 16] for details.

Our results suggest a simple revision of QUERY that uses $\mathcal{A}(\mathcal{R})$ instead of $\mathcal{K}(\mathcal{R})$. Actually
this approach was also suggested by Doignon under the name of adjusted QUERY, though an
effective way of handling $\mathcal{A}(\mathcal{R})$ was still under investigation^2 [14, p. 14]. Now the adjusted
QUERY is efficiently implementable. We believe that this is a desired elaboration of QUERY,
which affirmatively answers the above question. Indeed, by Theorem 1.1, the resulting space
$\mathcal{A}(\mathcal{R})$ is always an antimatroid that includes the target antimatroid $\mathcal{L}$, and might be a reasonable
approximation of $\mathcal{L}$. As seen in Theorem 1.2, the association $\mathcal{R} \mapsto \mathcal{A}(\mathcal{R})$ is manageable
by computer. Moreover, we can avoid giving redundant queries to the expert by making use of
the algorithm in Theorem 1.3.


Related work. Eppstein, Falmagne, and Uzun [20] consider a different approach to the
above question of Doignon and Falmagne, based on base families of antimatroids; see
also [22, Section 16.3]. Here the base of a (union-closed) family $\mathcal{K}$ consists of the members of
$\mathcal{K}$ that cannot be represented as a union of other members. Clearly the base $\mathcal{B}$
of $\mathcal{K}$ recovers the original $\mathcal{K}$ completely. Eppstein, Falmagne, and Uzun study several
algorithmic questions on the base of an antimatroid (a well-graded family, more generally).
They developed a polynomial time algorithm to determine whether a given family $\mathcal{B}$ is the
base of an antimatroid. Moreover, they also provided a polynomial time algorithm to construct
(the base of) a minimal antimatroid containing a given (union-closed) family. These results
lead to another elaboration of the QUERY algorithm, which first estimates a union-closed
family $\mathcal{K}$ by the original QUERY algorithm, and, from the base of $\mathcal{K}$, constructs and outputs
a minimal antimatroid $\mathcal{A}$ containing $\mathcal{K}$. Our approach may be viewed as a counterpart of
theirs, since it constructs a maximal antimatroid contained in a union-closed family.

Wild [35] applies a compression technique to the antimatroid construction in KST. This
technique encodes a union-closed family $\mathcal{L}$ into a {0,1,2,n}-valued matrix for which each row
vector is a compressed expression of a subfamily of $\mathcal{L}$ and $\mathcal{L}$ is the disjoint union of these
subfamilies. This matrix is constructed from an implicational basis of $\mathcal{L}$. He discusses how to
run the adapted QUERY with this encoding. It would be an interesting research direction
to incorporate this compression technique into our revision (or adjusted QUERY).


Organization. The rest of the paper is organized as follows. In Section 2, we give preliminary
arguments including a proof of Theorem 1.1. We also summarize some basic relationships
among Horn rules, entailments, closure operators, and convex geometries, with the help of
results in implicational systems, and then explain the Korte-Lovász representation. In
Section 3, we give algorithmic results. We present algorithms for Theorems 1.2, 1.3, and 1.4.
We also give an efficient algorithm to generate all members of $\mathcal{A}(\mathcal{R})$ and a resolution-type
algorithm to generate all implicates of $\mathcal{A}(\mathcal{R})$. We then discuss the above revised
QUERY algorithm in more detail. Preliminary experimental results show that the revised
QUERY algorithm saves 30% of the queries needed by the original QUERY when the
target space is an antimatroid. Finally, we mention open problems and future research
issues.


2 Preliminaries 

Throughout the paper, $Q$ denotes a finite set. By a family we mean a family of subsets of $Q$.
The union $A \cup \{a\}$ of a set $A$ and an element $a$ is denoted by $A + a$. The difference $A \setminus \{a\}$
is denoted by $A - a$. We assume that any union-closed family $\mathcal{K}$ considered in this paper
contains the empty set, i.e.,

$\emptyset \in \mathcal{K}$,

whereas any intersection-closed family contains the whole set $Q$. For a union-closed family $\mathcal{K}$
and a set $X \subseteq Q$, there uniquely exists a maximal member $Y \in \mathcal{K}$ contained in $X$. This $Y$ is
denoted by $X^\circ$.

An antimatroid is usually defined as a family $\mathcal{K}$ on $Q$ satisfying (Union-closedness),
(Accessibility), and

$Q \in \mathcal{K}$.

We here call such an antimatroid proper. For an improper antimatroid $\mathcal{A}$, every member is
contained in $Q^\circ (\in \mathcal{A})$. Hence $\mathcal{A}$ is regarded as a family on $Q^\circ$, and is a proper antimatroid
on $Q^\circ$. Thus known results and properties for (proper) antimatroids are easily adjusted for
improper ones. We remark that a union-closed family may or may not contain a proper
antimatroid. Notice that a union-closed family $\mathcal{K}$ contains a proper antimatroid if and only
if there is a tight path from $\emptyset$ to $Q$ in $\mathcal{K}$.

As explained in several papers on implicational systems, a set $\mathcal{R}$ of Horn rules is naturally
identified with a (pure) Horn Boolean CNF $\varphi$, and $\mathcal{K}(\mathcal{R})$ is identified with the set of true points
of $\varphi$; see [H Section 5] and [36, Section 3.4]. Various Boolean function theoretic concepts and
algorithms are easily adapted to our setting. In the Appendix, we briefly summarize this relation
to Horn Boolean CNFs.


2.1 The unique maximal antimatroid in a union-closed family 

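Here we prove Theorem 1.1. Let $\mathcal{K}$ be a union-closed family. We first show that $\overline{\mathcal{K}}$ is an
antimatroid. Since the accessibility immediately follows from the definition of $\overline{\mathcal{K}}$, it suffices
to show that $\overline{\mathcal{K}}$ is union-closed. Let $X$ and $Y$ be members of $\overline{\mathcal{K}}$. By definition, there are tight
paths $\emptyset = X_0, X_1, \ldots, X_k = X$ and $\emptyset = Y_0, Y_1, \ldots, Y_m = Y$ in $\mathcal{K}$. Then the distinct members
of $\{X \cup Y_i \mid i = 0, \ldots, m\}$ form a tight path from $X$ to $X \cup Y$. By the union-closedness of
$\mathcal{K}$, all of them are members of $\mathcal{K}$. Combining it with the tight path from $\emptyset$ to $X$, we obtain a
tight path from $\emptyset$ to $X \cup Y$ in $\mathcal{K}$. Hence we have $X \cup Y \in \overline{\mathcal{K}}$.

Finally we show the maximality of $\overline{\mathcal{K}}$. Let $\mathcal{L}$ be an antimatroid contained in $\mathcal{K}$. By the
accessibility, for every member $X$ of $\mathcal{L}$, there is a tight path from $\emptyset$ to $X$ in $\mathcal{L}$, and hence in $\mathcal{K}$.
Hence $\mathcal{L} \subseteq \overline{\mathcal{K}}$, and Theorem 1.1 is proved.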

Example 2.1. Let $Q = \{0, 1, 2, 3, 4, 5, 6\}$. Let $\mathcal{R}$ be the set of rules consisting of

({0,1,2,3}, 6), ({0,1,3,5,6}, 2), ({0,3,4,5,6}, 2), ({0,3,5,6}, 1), ({0,5,6}, 4),
({1,2,3}, 0), ({1,4,5}, 2), ({1,5}, 4), ({2,3,4}, 1), ({2,3,6}, 4), ({2,5,6}, 4), and ({2,6}, 3).

As in Figure 1, there are 26 members in $\mathcal{A}(\mathcal{R})$ and 53 members in $\mathcal{K}(\mathcal{R})$, where members in
$\mathcal{A}(\mathcal{R})$ are colored gray.

Figure 1: $\mathcal{K}(\mathcal{R})$ and $\mathcal{A}(\mathcal{R})$.

2.2 Closure operator and entailment 

Here we first summarize the basic relationship between Horn rules, entailments, and closure
operators. This is essentially the one given by Dowling [16, Section 2]; see [9, 36] for the viewpoint of
implicational systems. We then discuss entailments for antimatroids (or convex geometries).

We allow a rule $(A, q)$ to have $q \in A$; such a rule accepts every subset, and is called trivial.
Thus a set of rules is a subset of $2^Q \times Q$, and is a binary relation between $2^Q$ and $Q$ (called
an implication relation by Dowling [16]). An entailment (entail relation) on $Q$ is a binary
relation $\mathcal{R} \subseteq 2^Q \times Q$ satisfying

(E1) for all $A \subseteq Q$ and $q \in A$, it holds $A\mathcal{R}q$.

(E2) for all nonempty $A, B \subseteq Q$ and $q \in Q$, if $A\mathcal{R}b$ holds for all $b \in B$ and $B\mathcal{R}q$ holds, then
$A\mathcal{R}q$ holds.

Here we say "$A\mathcal{R}q$ holds" if $(A, q) \in \mathcal{R}$. In the terminology of implicational systems, an
entailment is a full unit implicational system (full UIS) [9].

The following is a fundamental relationship between union-closed families and entailments;
recall that an implicate of a family $\mathcal{K}$ is a rule $(A, q)$ accepted by all members of $\mathcal{K}$, i.e.,
$A \cap X \neq \emptyset$ or $q \notin X$ for all $X \in \mathcal{K}$.

Theorem 2.2 (Galois connection [23E2]; see [23 Chapter 7]). (1) For $\mathcal{K} \subseteq 2^Q$, the set $\mathcal{R}$
of all implicates of $\mathcal{K}$ is an entailment, and $\mathcal{K}(\mathcal{R})$ is the unique minimal union-closed
family containing $\mathcal{K}$. In particular, if $\mathcal{K}$ is union-closed, then $\mathcal{K} = \mathcal{K}(\mathcal{R})$.

(2) For $\mathcal{R} \subseteq 2^Q \times Q$, the set of all implicates of $\mathcal{K}(\mathcal{R})$ is the unique minimal entailment
containing $\mathcal{R}$. In particular, if $\mathcal{R}$ is an entailment, then $\mathcal{R}$ is equal to the set of all
implicates of $\mathcal{K}(\mathcal{R})$.
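Part (1) of the Galois connection can be observed on a tiny example by listing all implicates of a family and then taking the family they accept. The sketch below is a brute-force illustration under the assumption that rules are encoded as pairs `(frozenset, element)`; the helper names and the toy family are hypothetical.

```python
from itertools import combinations


def powerset(ground):
    """All subsets of the ground set as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def all_implicates(family, ground):
    """Every rule (A, q) accepted by all members of the family (brute force)."""
    return {
        (a, q)
        for a in powerset(ground)
        for q in ground
        if all(q not in x or a & x for x in family)
    }


def family_K(rules, ground):
    """All subsets of the ground set accepted by every rule."""
    return {x for x in powerset(ground) if all(q not in x or a & x for a, q in rules)}


# Hypothetical family that contains the empty set but is not union-closed.
ground = frozenset({0, 1})
family = {frozenset(), frozenset({0}), frozenset({1})}
closed = family_K(all_implicates(family, ground), ground)
```

Here `closed` additionally contains $\{0,1\}$, the one union missing from `family`, and applying the same round trip to `closed` returns `closed` itself.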

For a family $\mathcal{K}$, let $\mathcal{K}^*$ denote the dual of $\mathcal{K}$ defined by

$\mathcal{K}^* := \{Q \setminus X \mid X \in \mathcal{K}\}$.

The dual of a union-closed family is an intersection-closed family (or a closure system). For
an intersection-closed family $\mathcal{F}$ on $Q$, we can define a map $\tau : 2^Q \to 2^Q$, called the closure
operator, by

$\tau(X) := \bigcap \{Y \in \mathcal{F} \mid X \subseteq Y\} \quad (X \subseteq Q)$.

For a union-closed family $\mathcal{K}$ and the closure operator $\tau$ of the dual $\mathcal{K}^*$, it obviously holds that

$Q \setminus \tau(X) = (Q \setminus X)^\circ \quad (X \subseteq Q)$. (2.1)


We will often use the following relation between implicates of $\mathcal{K}$ and the closure operator $\tau$ of $\mathcal{K}^*$,
which was given by Dowling-Müller [16, 32] in the literature of KST.

Lemma 2.3 ([32, Proposition 4.4 (i)]; see [Ml Proposition 2.8]). Let $\mathcal{K}$ be a union-closed
family and let $\tau$ be the closure operator of the dual $\mathcal{K}^*$. For a rule $(A, q)$, the following
conditions are equivalent:

(i) $(A, q)$ is an implicate of $\mathcal{K}$.

(ii) $q \in \tau(A)$.

(iii) $q \notin (Q \setminus A)^\circ$.

Proof. The equivalence between (ii) and (iii) follows from (2.1). Condition (ii) is equivalent
to: $q$ belongs to $\bigcap \{Q \setminus X \mid X \in \mathcal{K} : X \cap A = \emptyset\}$. This is further equivalent to: every $X \in \mathcal{K}$
disjoint from $A$ satisfies $q \notin X$. This is nothing but condition (i). □
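The three conditions of Lemma 2.3, together with relation (2.1), can be verified exhaustively on a small union-closed family. The following sketch assumes an explicit family given as frozensets; `interior` plays the role of $X \mapsto X^\circ$ and `closure` the role of $\tau$, and all names and the toy family are hypothetical.

```python
from itertools import combinations


def interior(family, x):
    """The largest member contained in x, computed as the union of all such members."""
    return frozenset().union(*(m for m in family if m <= x))


def closure(family, ground, x):
    """tau(x): intersection of the complements of members that contain x."""
    result = ground
    for m in family:
        if x <= ground - m:
            result = result & (ground - m)
    return result


def is_implicate(family, rule):
    """Condition (i): every member containing q meets A."""
    a, q = rule
    return all(q not in m or bool(a & m) for m in family)


# Hypothetical union-closed family (containing the empty set) on {0, 1, 2}.
ground = frozenset({0, 1, 2})
family = {frozenset(), frozenset({2}), frozenset({0, 1}), ground}
subsets = [frozenset(c) for r in range(4) for c in combinations(sorted(ground), r)]

# Lemma 2.3: (i), (ii), and (iii) agree for every rule (A, q).
lemma_holds = all(
    is_implicate(family, (a, q))
    == (q in closure(family, ground, a))
    == (q not in interior(family, ground - a))
    for a in subsets
    for q in ground
)
# Relation (2.1): the complement of tau(X) is the interior of the complement of X.
relation_holds = all(
    ground - closure(family, ground, x) == interior(family, ground - x) for x in subsets
)
```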

We next give a characterization of the entailment of an antimatroid. As Koppen did in 123
p. 142, (27)], such a characterization of $\mathcal{R}$ is directly obtained from (Accessibility) as:

For every nonempty subset $X \subseteq Q$ with $X \cap A \neq \emptyset$ or $q \notin X$ for all $(A, q) \in \mathcal{R}$, there exists an element
$x$ in $X$ such that $(X - x) \cap A \neq \emptyset$ or $q \notin X - x$ for all $(A, q) \in \mathcal{R}$.


This is a rather cumbersome condition. We here provide another useful characterization, 
which is directly obtained from the anti-exchange property of convex geometries dual to 
antimatroids. 

A convex geometry [Ml is an intersection-closed family with its closure operator $\tau$ satisfying
the following anti-exchange property (AE):

(AE) for $X \subseteq Q$ and distinct $y, z \in Q$, if $y, z \notin \tau(X)$ and $z \in \tau(X + y)$, then $y \notin \tau(X + z)$.

It is well known that a family is a convex geometry if and only if its dual is an antimatroid m
Theorem 1.3]. Therefore, by using Lemma 2.3 and Theorem 2.2, the condition (AE) is translated
into a condition for an entailment $\mathcal{R}$ to have an antimatroid $\mathcal{K}(\mathcal{R}) (= \mathcal{A}(\mathcal{R}))$. We will
use the following slightly different characterization.


Proposition 2.4. For an entailment $\mathcal{R}$ on $Q$, the family $\mathcal{K}(\mathcal{R})$ is an antimatroid if and only
if $\mathcal{R}$ satisfies:

(AE') for $X \subseteq Q$ and distinct $y, z \in Q$, if $(X + y)\mathcal{R}z$ and $(X + z)\mathcal{R}y$ hold, then $X\mathcal{R}y$ and
$X\mathcal{R}z$ hold.

Proof. Observe that (AE') implies (AE). We show the converse. Suppose that the entailment $\mathcal{R}$
satisfies (AE). To show (AE'), suppose that $(X + y)\mathcal{R}z$ and $(X + z)\mathcal{R}y$ hold. By (AE), $X\mathcal{R}y$
or $X\mathcal{R}z$ holds. By symmetry, we can assume that $X\mathcal{R}y$ holds. Then, by (E1), $X\mathcal{R}x$ holds
for all $x \in X + y$. By (E2) and $(X + y)\mathcal{R}z$, we have $X\mathcal{R}z$, and obtain (AE'). □

Based on this characterization, in Section 3 we develop an algorithm to generate all
implicates of $\mathcal{A}(\mathcal{R})$.
2.3 Korte-Lovász representation

We here explain the relationship between the above construction of an antimatroid from Horn
rules and the construction of an antimatroid from rooted sets by Korte and Lovász [29]; see
also m Section III. 3]. A rooted set $(C, r)$ is a pair of a subset $C \subseteq Q$ and an element $r$
in $C$, where $r$ is called the root of $(C, r)$. For a given family $\mathcal{C}$ of rooted sets, Korte and
Lovász defined a family $\mathcal{L}(\mathcal{C}) \subseteq 2^Q$ as follows: a subset $X$ is a member of $\mathcal{L}(\mathcal{C})$ if and
only if there is an ordering $x_1, x_2, \ldots, x_k$ of the elements in $X$ such that $(C, x_i) \in \mathcal{C}$ implies
$C \cap \{x_1, x_2, \ldots, x_{i-1}\} \neq \emptyset$ for each $i$. In [25], a family (or a language) determined in this way
is called a transversal precedence structure.

Korte and Lovász showed that $\mathcal{L}(\mathcal{C})$ is always an antimatroid [30, Lemma 3.2]. In fact,
$\mathcal{L}(\mathcal{C})$ is equal to $\mathcal{A}(\mathcal{R})$, provided a nontrivial rule $(A, q)$ is associated with a rooted set $(A + q, q)$.
The map $(A, q) \mapsto (A + q, q)$ is a bijection between the set of all nontrivial rules and the set
of all rooted sets.

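Theorem 2.5. For a set $\mathcal{R}$ of nontrivial rules, let $\mathcal{C}$ be the set of rooted sets defined by
$\mathcal{C} := \{(A + q, q) \mid (A, q) \in \mathcal{R}\}$. Then it holds $\mathcal{A}(\mathcal{R}) = \mathcal{L}(\mathcal{C})$.

Proof. Pick $X$ from $\mathcal{L}(\mathcal{C})$. By definition, there is an ordering $x_1, x_2, \ldots, x_k$ of $X$ such that
$(A, x_i) \in \mathcal{R}$ implies $A \cap \{x_1, x_2, \ldots, x_{i-1}\} \neq \emptyset$. This means that $\{x_1, x_2, \ldots, x_i\} \in \mathcal{K}(\mathcal{R})$
for $i = 1, 2, \ldots, k$. Thus $\{x_1, x_2, \ldots, x_i\}$ $(i = 1, 2, \ldots, k)$ form a tight path from $\emptyset$ to $X$
in $\mathcal{K}(\mathcal{R})$, and $X \in \mathcal{A}(\mathcal{R})$. Conversely, pick $X$ from $\mathcal{A}(\mathcal{R})$. Then there is a tight path
$\emptyset, \{x_1\}, \{x_1, x_2\}, \ldots, \{x_1, x_2, \ldots, x_k\} = X$ in $\mathcal{K}(\mathcal{R})$. The ordering $x_1, x_2, \ldots, x_k$ of $X$ fulfills
the definition of $\mathcal{L}(\mathcal{C})$, hence $X \in \mathcal{L}(\mathcal{C})$. Thus we conclude that $\mathcal{A}(\mathcal{R}) = \mathcal{L}(\mathcal{C})$. □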
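The equality in Theorem 2.5 can be checked on small inputs by comparing two brute-force computations: the ordering-based definition of the rooted-set family and the tight-path definition of $\mathcal{A}(\mathcal{R})$. The sketch below is only an illustration; the function names and the two toy rule sets are hypothetical, and both routines are exponential.

```python
from itertools import combinations, permutations


def in_language(rooted_sets, x):
    """Is there an ordering of x in which every rooted set (C, x_i) meets the prefix?"""
    for order in permutations(sorted(x)):
        seen = set()
        ok = True
        for e in order:
            if any(r == e and not (c & seen) for c, r in rooted_sets):
                ok = False
                break
            seen.add(e)
        if ok:
            return True
    return False


def rooted_set_family(rules, ground):
    """The Korte-Lovasz family for the rooted sets (A + q, q) of the nontrivial rules."""
    rooted = [(a | {q}, q) for a, q in rules if q not in a]
    subsets = (frozenset(c) for r in range(len(ground) + 1)
               for c in combinations(sorted(ground), r))
    return {x for x in subsets if in_language(rooted, x)}


def antimatroid_from_rules(rules, ground):
    """A(R) via tight paths: grow sets one element at a time inside K(R)."""
    reached, stack = {frozenset()}, [frozenset()]
    while stack:
        s = stack.pop()
        for e in ground - s:
            t = s | {e}
            if t not in reached and all(q not in t or a & t for a, q in rules):
                reached.add(t)
                stack.append(t)
    return reached


# Hypothetical toy inputs on the ground set {0, 1, 2}.
ground = frozenset({0, 1, 2})
rules_a = [(frozenset({0}), 1), (frozenset({0, 1}), 2)]
rules_b = [(frozenset({0}), 1), (frozenset({1}), 0)]
```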

Korte and Lovász introduced a natural class of rooted sets determined by and determining
an antimatroid. Let $\mathcal{A}$ be an antimatroid, $\mathcal{A}^*$ the convex geometry dual to $\mathcal{A}$, and $\tau$ the closure
operator of $\mathcal{A}^*$. A subset $X$ is said to be free if $\{X \cap Y \mid Y \in \mathcal{A}\}$ is equal to $2^X$. A circuit
is a subset $C$ such that $C$ is not free and every proper subset of $C$ is free^3. It is known in
m Lemma 3.2] that there is a unique element $r$, called the root, in a circuit $C$ such that
$\tau(C) - r \notin \mathcal{A}^*$ and $\tau(C) - s \in \mathcal{A}^*$ for $s \in C - r$. Now a circuit $C$ is regarded as a rooted set
$(C, r)$ for the root $r$ of $C$. A critical circuit is a rooted set $(C, r)$ such that $\tau(C) - r \notin \mathcal{A}^*$ and
$\tau(C) - r - s \in \mathcal{A}^*$ for $s \in C - r$. It is known in jM] p.31] that a critical circuit is indeed a
circuit. Circuits can determine the original antimatroid, and such circuits always contain all
critical circuits, as follows.

Theorem 2.6 ([29]; see [30, Theorem 3.11]). Let $\mathcal{A}$ be an antimatroid, and let $\mathcal{S}$ be the set
of critical circuits of $\mathcal{A}$. Then the following hold:

(1) $\mathcal{A} = \mathcal{L}(\mathcal{S})$.

(2) For any family $\mathcal{C}$ of circuits of $\mathcal{A}$, if $\mathcal{A} = \mathcal{L}(\mathcal{C})$, then $\mathcal{S} \subseteq \mathcal{C}$.

We develop in Section 3.3 an algorithm to construct the set $\mathcal{S}$ of all critical circuits
of $\mathcal{A} = \mathcal{A}(\mathcal{R})$ from $\mathcal{R}$. For this purpose, we use the following Boolean function theoretic
characterization of circuits. A nontrivial implicate $(A, q)$ of $\mathcal{A}$ is said to be prime if for every
$a \in A$, the rule $(A - a, q)$ is not an implicate of $\mathcal{A}$. This definition is consistent with prime
implicates of the corresponding Horn CNF m Definition 1.21]. The set of prime implicates
is shown to be equal to the canonical direct unit implicational basis in [9]. In the case of
an antimatroid (convex geometry), Wild [34] showed that this basis is exactly the set of all
circuits.
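The definition of a prime implicate translates directly into a test on an explicitly listed antimatroid. The following minimal sketch assumes the family is given as a set of frozensets; the function names and the toy antimatroid are hypothetical.

```python
def is_implicate(family, a, q):
    """(A, q) is an implicate if every member containing q meets A."""
    return all(q not in x or bool(a & x) for x in family)


def is_prime_implicate(family, a, q):
    """A nontrivial implicate (A, q) is prime if no (A - a', q) is an implicate."""
    if q in a or not is_implicate(family, a, q):
        return False
    return all(not is_implicate(family, a - {e}, q) for e in a)


# Hypothetical antimatroid on {0, 1, 2}: question 0 must be mastered first.
antimatroid = {frozenset(s) for s in [(), (0,), (0, 1), (0, 2), (0, 1, 2)]}
```

In this toy antimatroid the rule $(\{0\}, 1)$ is a prime implicate, whereas $(\{0, 2\}, 1)$ is an implicate that is not prime because the element 2 can be dropped.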

^3 Any free subset is a subset of $Q^\circ$, and circuits are for the proper antimatroid on $Q^\circ$.
Proposition 2.7 (implied by [9, Theorem 15, Corollary 22] and [34, Corollary 13]). Let $\mathcal{A}$
be an antimatroid.

(1) For a circuit $(C, r)$ of $\mathcal{A}$, the rule $(C - r, r)$ is a prime implicate of $\mathcal{A}$.

(2) For a prime implicate $(A, q)$ of $\mathcal{A}$, the rooted set $(A + q, q)$ is a circuit of $\mathcal{A}$.

For completeness, we give a direct proof in the Appendix. It should be noted that the circuit
concept is extended to general closure spaces/union-closed families [2H], and an analogous
characterization holds [36, Section 3.3].

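In particular, for any union-closed family $\mathcal{K}$, the set $\mathcal{R}$ of all prime implicates determines
$\mathcal{K}$ as $\mathcal{K} = \mathcal{K}(\mathcal{R})$. Thus we have the following.

Corollary 2.8 ([29]; see [12, Lemma 2]). Let $\mathcal{A}$ be an antimatroid, $\mathcal{C}$ the set of all circuits in
$\mathcal{A}$, and $\mathcal{R} := \{(C - r, r) \mid (C, r) \in \mathcal{C}\}$ the corresponding set of rules. Then it holds

$\mathcal{A} = \mathcal{A}(\mathcal{R}) = \mathcal{K}(\mathcal{R})$.

As remarked in (HU p. 137] and [25, Example 21], the set $\mathcal{R}^*$ of critical rules (circuits)
of an antimatroid $\mathcal{A}$ is not necessarily an implicational basis of $\mathcal{A}$, i.e., the inclusion
$\mathcal{A} \subseteq \mathcal{K}(\mathcal{R}^*)$ is possibly strict.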


3 Algorithms 


3.1 Membership and inference problems 

We give linear time algorithms for the membership and inference problems defined in the
introduction. From now on, let $n$ denote the cardinality of the ground set $Q$. The size
$l(\mathcal{R}) := \sum_{(A,q) \in \mathcal{R}} (|A| + 1)$ of the input $\mathcal{R}$ is simply denoted by $l$. We tacitly assume that $l \geq n$.

We first provide a linear time algorithm to compute the maximum member $X^\circ \subseteq X$ in $\mathcal{A}(\mathcal{R})$
from given $X \subseteq Q$ and $\mathcal{R}$. Linear time algorithms for the membership and inference problems
are immediate consequences (via Lemma 2.3).

The idea of our algorithm is to trace a tight path in $\mathcal{K}(\mathcal{R})$ from the empty set. We show
that a tight path to $X$, if it exists, is obtained by greedily adding elements from the empty
set.


Proposition 3.1. Let $\mathcal{K}$ be a union-closed family on $Q$. For any subset $X$ of $Q$, the following
statements are equivalent:

(i) There exists a tight path from $\emptyset$ to $X$ in $\mathcal{K}$.

(ii) For any member $S$ in $\mathcal{K}$, if $S$ is a proper subset of $X$, there exists an element $x \in X \setminus S$
such that $S + x \in \mathcal{K}$.

Proof. It is obvious that (ii) implies (i). Indeed, according to (ii), we can construct a tight
path from $\emptyset$ to $X$ by adding elements of $X$.

We next show that (i) implies (ii). Suppose that $\mathcal{K}$ contains a tight path
$\emptyset, \{q_1\}, \{q_1, q_2\}, \ldots, \{q_1, q_2, \ldots, q_m\} = X$. Let $S$ be a member of $\mathcal{K}$. Suppose that $S$ is a
proper subset of $X$. Take the minimum index $i$ with $q_i \notin S$. By the union-closedness of $\mathcal{K}$,
we have $S + q_i = S \cup \{q_1, q_2, \ldots, q_i\} \in \mathcal{K}$. Thus $q_i$ is a required element. □

According to Proposition 3.1, we easily obtain the following algorithm to compute $X^\circ$.
Starting with $S = \emptyset$, if there exists an element $x \in X \setminus S$ such that $S + x$ is again a member
of $\mathcal{K}(\mathcal{R})$, then add $x$ to $S$, and repeat. If such an element $x$ does not exist, then the current $S$
is actually equal to $X^\circ$. In each iteration, we can check whether $S + x$ is a member of $\mathcal{K}(\mathcal{R})$
in $O(l)$. Therefore the time complexity of this algorithm is $O(n^2 l)$.
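The naive greedy procedure just described can be sketched as follows. This is a direct, unoptimized transcription under the assumption that rules are pairs `(frozenset, element)`; the function name and the toy rule sets are hypothetical.

```python
def interior_naive(rules, x):
    """Greedy computation of X°: grow S inside K(R) one element at a time."""
    x = frozenset(x)
    s = frozenset()
    grew = True
    while grew:
        grew = False
        for e in sorted(x - s):
            t = s | {e}
            # S + e must be accepted by every rule (A, q) to stay in K(R).
            if all(q not in t or a & t for a, q in rules):
                s = t
                grew = True
                break
    return s


# Hypothetical toy rule sets on the ground set {0, 1, 2}.
rules_a = [(frozenset({0}), 1), (frozenset({0, 1}), 2)]
rules_b = [(frozenset({0}), 1), (frozenset({1}), 0)]
```

Each successful step rescans the candidates and rechecks all rules, which is where the factor $n^2 l$ comes from.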

We are going to improve this naive algorithm by searching the feasible continuations $x$
of the current $S$ efficiently. We first point out that only a part of the input $\mathcal{R}$ is required to
check whether $S + x$ belongs to $\mathcal{K}(\mathcal{R})$. Indeed, we do not have to care about the elements not
in $X$. Hence, instead of $(A, q)$, we may keep the rule $(A \cap X, q)$. If $q$ is not in $X$, then $(A, q)$
can be ignored. Also, if $(A, q)$ satisfies $A \cap S \neq \emptyset$, then $(A, q)$ accepts $S + x$ for any $x$, and
can be ignored. Summarizing, instead of $\mathcal{R}$, it suffices to keep, in each step, the following set
of rules:

$\mathcal{R}_S := \{(A \cap X, q) \mid (A, q) \in \mathcal{R} : A \cap S = \emptyset,\ q \in X \setminus A\}$.

The next lemma explains how to obtain $x$ with $S + x \in \mathcal{K}(\mathcal{R})$ from $\mathcal{R}_S$.

Lemma 3.2. Let $X$ be an arbitrary subset of $Q$, and $S$ a member of $\mathcal{K}(\mathcal{R})$ properly contained
in $X$. For $x \in X \setminus S$, the subset $S + x$ belongs to $\mathcal{K}(\mathcal{R})$ if and only if $\mathcal{R}_S$ has no rule of the
form $(A, x)$.

Proof. The if part is obvious from the above discussion. To show the only if part, let $(A, x) \in
\mathcal{R}_S$. Since $A \cap S = \emptyset$ (and $x \notin A$), $A$ has no intersection with $S + x$. Hence $S + x \notin \mathcal{K}(\mathcal{R})$. □

Now we are ready to describe our algorithm. 

Algorithm 3.3 (to compute $X^\circ$).

Input: A set $\mathcal{R}$ of Horn rules and a subset $X$ of $Q$.

Output: $X^\circ$.

1: $S \leftarrow \emptyset$

2: $C := \{x \in X \setminus S \mid \mathcal{R}_S$ has no rules of the form $(A, x)\}$

3: if $C = \emptyset$ then
4: return $S$

5: end if

6: Choose $q \in C$, $S \leftarrow S + q$ (or $S \leftarrow S \cup C$ directly), update $\mathcal{R}_S$, and go to line 2

We give a linear time implementation of this algorithm by using an appropriate data
structure to maintain $\mathcal{R}_S$. Suppose that $\mathcal{R} = \{R_1, R_2, \ldots, R_m\}$ and $R_i = (A_i, q_i)$ for $i =
1, 2, \ldots, m$. In each iteration, we retain $\mathcal{R}_S$ by three kinds of lists $H_x$, $T_x$, and $E$. For each
$x \in X \setminus S$, let $H_x$ be the set of indexes $i$ such that $A_i \cap S = \emptyset$ and $x \in A_i$, and $T_x$ be the set
of indexes $j$ such that $x = q_j$ and $A_j \cap S = \emptyset$. Let $E$ be the set of elements $q \in X \setminus S$ such
that $T_q$ is empty. Here $T_x$ is kept by a doubly-linked list, and $H_x$ and $E$ are kept by a stack
or queue.

The initialization is done as follows. Scan $(A_i, q_i)$ for $i = 1, 2, \ldots, m$. Append index
$i$ to $T_{q_i}$. We also keep a pointer from $i$ to "$i$" in the list $T_{q_i}$. For each element $x \in A_i$, push
index $i$ to $H_x$. After that, $E$ is obtained by pushing elements $q$ with $T_q = \emptyset$. The total time
is $O(l)$; recall $n \leq l$. In each iteration, it holds $C = E$. Pop $q$ from $E$, and add $q$ to $S$. The
update of $\mathcal{R}_S$ is as follows. Pop all indices $i$ from $H_q$, and remove $i$ from the list $T_{q_i}$ by tracing
the pointer from $i$ (in constant time), since it now holds $A_i \cap S = \{q\} \neq \emptyset$. If $T_{q_i}$ gets empty,
then we push $q_i$ to $E$. The computation time of the update is $O(|H_q|)$. Since each rule
in $\mathcal{R}$ is referred to at most once, the total time complexity is $O(l)$.
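The bookkeeping above can be sketched in a few lines. The version below is a simplified variant and not a literal transcription: instead of doubly-linked lists $T_x$ with back pointers, it keeps a counter of the still-blocking rules for each head element and an `active` flag per rule, which is an assumption of this sketch. The lists `watch` and `ready` correspond loosely to $H_x$ and $E$; all names are hypothetical.

```python
def interior_linear(rules, x):
    """Compute X° in A(R) with work proportional to the total size of the rules."""
    x = set(x)
    blockers = {e: 0 for e in x}  # plays the role of |T_e|: unsatisfied rules with head e
    watch = {e: [] for e in x}    # plays the role of H_e: rules whose body contains e
    active = []
    for i, (a, q) in enumerate(rules):
        live = q in x and q not in a  # ignore trivial rules and heads outside X
        active.append(live)
        if live:
            blockers[q] += 1
            for e in a:
                if e in x:  # only the part A ∩ X of the body matters
                    watch[e].append(i)
    ready = [e for e in x if blockers[e] == 0]  # plays the role of E
    s = set()
    while ready:
        e = ready.pop()
        s.add(e)
        for i in watch[e]:
            if active[i]:  # rule i is now satisfied, since its body meets S
                active[i] = False
                q = rules[i][1]
                blockers[q] -= 1
                if blockers[q] == 0:
                    ready.append(q)
    return frozenset(s)


# Hypothetical toy rule sets on the ground set {0, 1, 2}.
rules_a = [(frozenset({0}), 1), (frozenset({0, 1}), 2)]
rules_b = [(frozenset({0}), 1), (frozenset({1}), 0)]
```

Each rule is deactivated at most once, and each occurrence of an element in a rule body is visited at most once, which matches the $O(l)$ accounting in the text.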

Theorem 3.4. For a set $\mathcal{R}$ of rules and a subset $X$, the maximum member $X^\circ \subseteq X$ in $\mathcal{A}(\mathcal{R})$
can be obtained in linear time.

Dually speaking, the closure operator of the dual of $\mathcal{A}(\mathcal{R})$ is computable in linear time.
Now a linear time algorithm for the membership problem (Theorem 1.2) is immediate:
Algorithm 3.5 (to solve the membership problem).
Input: A set $\mathcal{R}$ of rules and a subset $X$.

Output: YES if $X \in \mathcal{A}(\mathcal{R})$, and NO otherwise.

1: Obtain $X^\circ$ by Algorithm 3.3.

2: if $X = X^\circ$ then

3: return YES // $X$ is a member of $\mathcal{A}(\mathcal{R})$.

4: end if

5: return NO // $X$ is not a member of $\mathcal{A}(\mathcal{R})$.


Recalling Lemma 2.3 for a characterization of implicates, we obtain a linear time algorithm
for the inference problem (Theorem 1.3) as follows.

Algorithm 3.6 (to solve the inference problem).
Input: A set $\mathcal{R}$ of rules and a rule $(A, q)$.
Output: YES if $(A, q)$ is an implicate of $\mathcal{A}(\mathcal{R})$, and NO otherwise.
1: Obtain $(Q \setminus A)^\circ$ by Algorithm 3.3.
2: if $q \notin (Q \setminus A)^\circ$ then
3:   return YES // $(A, q)$ is an implicate of $\mathcal{A}(\mathcal{R})$.
4: end if
5: return NO // $(A, q)$ is not an implicate of $\mathcal{A}(\mathcal{R})$.
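
Algorithms 3.5 and 3.6 are both one-line reductions to the computation of $X^\circ$, which the following sketch makes explicit. It is illustrative only: the helper `max_member` is a simple greedy fixpoint used here as a stand-in for Algorithm 3.3 (it is not linear time), and the function names and the explicit ground-set argument `Q` are assumptions of this sketch.

```python
def max_member(rules, X):
    """X°: greedy fixpoint (simple, non-linear-time stand-in for Algorithm 3.3)."""
    X, S = set(X), set()
    grew = True
    while grew:
        grew = False
        for x in list(X - S):
            # x may be added if every rule with head x is trivial or already meets S
            if all(q != x or x in A or S & set(A) for A, q in rules):
                S.add(x)
                grew = True
    return S

def is_member(rules, X):
    """Algorithm 3.5: X is a member of A(R) iff X equals X°."""
    return max_member(rules, X) == set(X)

def is_implicate(rules, A, q, Q):
    """Algorithm 3.6: (A, q) is an implicate of A(R) iff q is not in (Q minus A)°."""
    return q not in max_member(rules, set(Q) - set(A))
```

For the two rules $(\{0,2\},1)$ and $(\{1,3\},0)$ on $Q = \{0,1,2,3\}$ used in Example 4.3, `is_member` rejects $\{0,1\}$ (a member of $\mathcal{K}(\mathcal{R})$ that is not accessible), and `is_implicate` accepts $(\{2,3\},0)$.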


3.2 Generating all members of $\mathcal{A}(\mathcal{R})$

As an application of the membership algorithm, we here provide a simple and efficient
algorithm to enumerate all members of $\mathcal{A}(\mathcal{R})$. The efficiency of an enumeration algorithm is
measured by the time delay (interval) between two consecutive outputs. Our enumeration
algorithm has $O(nl)$ time delay, and is designed on the basis of the standard technique of
reverse search [8].

Let $\mathcal{A} = \mathcal{A}(\mathcal{R})$ be an antimatroid given by a set $\mathcal{R}$ of rules. Let $Q = \{1, 2, \ldots, n\}$; we
will use the natural ordering $<$. Following the terminology in [25], the outer fringe $X^{O}$ and the
inner fringe $X^{I}$ of a member $X \in \mathcal{A}$ are defined by

$X^{O} := \{x \in Q \setminus X \mid X + x \in \mathcal{A}\}$, (3.1)

$X^{I} := \{x \in X \mid X - x \in \mathcal{A}\}$. (3.2)

For a nonempty member $X$ in $\mathcal{A}$, let $\phi(X)$ be defined as the element $q$ in $X^{I}$ such that $q$ is
added in the last iteration of Algorithm 3.3 for the input $X (= X^\circ)$. The map $X \mapsto \phi(X)$ is
well-defined if the data structures of inputs $\mathcal{R}$ and $X$ are fixed (so that the smallest $q$ with
respect to $<$ is chosen from $C$ in line 6 of Algorithm 3.3). Let $\mathcal{F}(X)$ be the set of elements $x$
in $X^{O}$ with $x = \phi(X + x)$. Let $\mathcal{T}$ be a directed graph on $\mathcal{A}$, where an edge from $X$ to $X'$ is
given if and only if $X - \phi(X) = X'$. Every nonempty member $X$ has exactly one edge leaving
$X$. Thus we obtain the following.

Lemma 3.7. $\mathcal{T}$ is a spanning rooted tree with root $\emptyset$.

Our algorithm is a depth first search on $\mathcal{T}$. Starting at the root $X = \emptyset$, if $X$ is labeled
(or output), then we next label $X + x$ by choosing the smallest $x$ from $\mathcal{F}(X)$ with unlabeled
$X + x$. If such an $x$ does not exist, then we backtrack by computing $\phi(X)$. An important point
is that this can be done using only local information at $X$.

Algorithm 3.8 (to enumerate all members in $\mathcal{A}(\mathcal{R})$).
Input: A set $\mathcal{R}$ of rules.
Output: All members in $\mathcal{A}(\mathcal{R})$.
1: $X \leftarrow \emptyset$.
2: Output $X$.
3: Compute $\mathcal{F}(X)$.
4: if $\mathcal{F}(X)$ is empty then
5:   go to line 11.
6: end if
7: Choose the minimum element $y$ in $\mathcal{F}(X)$, $X \leftarrow X + y$, and go to line 2.
8: if $X = \emptyset$ then
9:   stop.
10: end if
11: $y \leftarrow \phi(X)$ and $X \leftarrow X - y$. // backtracking
12: if there is no element $x$ in $\mathcal{F}(X)$ with $x > y$ then
13:   go to line 8.
14: end if
15: Choose the minimum element $x$ in $\mathcal{F}(X)$ with $x > y$, $X \leftarrow X + x$, and go to line 2.

We can compute $\phi$ in $O(l)$ time (Algorithm 3.3), and can compute $\mathcal{F}(X)$ in $O(nl)$ time
by $n$ computations of $\phi$. We estimate the time delay between consecutive outputs. Suppose
that $X$ is output at line 2. Next $\mathcal{F}(X)$ is computed in $O(nl)$ time. If the algorithm goes to
line 7, then it goes to line 2, and hence the delay is $O(nl)$ time. Suppose that the algorithm
goes to line 11. The backtracking loop (lines 11 to 15) iterates at most $n$ times. We need
not compute $\phi$ and $\mathcal{F}$ in the backtracking. In lines 7 and 15, if $y$ and $\mathcal{F}(X)$ are pushed
onto a stack, then $\phi(X)$ and $\mathcal{F}(X)$ in lines 11 and 12 are popped off the stack. Thus the
backtracking loop is conducted in $O(n^2)$ time. Summarizing, we have the following.

Theorem 3.9. Algorithm 3.8 enumerates all members of $\mathcal{A}(\mathcal{R})$ with $O(nl)$ delay.
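
The tree $\mathcal{T}$ can be traversed with very little code. The sketch below is a simplified illustration rather than Algorithm 3.8 itself: it keeps an explicit stack of members instead of the backtracking bookkeeping, so it does not attain the stated delay bound. The names `last_added` (playing the role of $\phi$) and `members` are hypothetical, and elements are assumed to be sortable and different from `None`.

```python
def last_added(rules, X):
    """phi(X): the element added last when X is rebuilt greedily, always
    taking the smallest addable element. Returns None if X is empty or
    X is not a member of A(R)."""
    X, S, last = set(X), set(), None
    while True:
        addable = [x for x in sorted(X - S)
                   if all(q != x or x in A or S & set(A) for A, q in rules)]
        if not addable:
            break
        last = addable[0]
        S.add(last)
    return last if S == X else None

def members(rules, Q):
    """Yield every member of A(R) exactly once by walking the tree T."""
    stack = [frozenset()]
    while stack:
        X = stack.pop()
        yield X
        for x in sorted(set(Q) - X, reverse=True):
            Y = X | {x}
            if last_added(rules, Y) == x:   # X is the parent of Y in T
                stack.append(Y)
```

Because every nonempty member has exactly one parent $X - \phi(X)$, no table of visited members is needed; this is the point of the reverse search technique.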


3.3 Computing critical circuits

As an application of the inference algorithm, we here present a quadratic time algorithm to
construct all critical rules (or circuits) of the antimatroid $\mathcal{A}(\mathcal{R})$ from $\mathcal{R}$. Recall from Section 2.3
that a critical rule of an antimatroid $\mathcal{A}$ is a rule $(A, q)$ such that the rooted set $(A + q, q)$ is
a critical circuit of $\mathcal{A}$. Our algorithm is justified by the following lemma, which is a slight
generalization of Theorem 2.6 (2).

Lemma 3.10. Let $\mathcal{R}$ be a set of rules. For a critical rule $(A, q)$ of $\mathcal{A}(\mathcal{R})$, there exists a rule
$(A', q)$ in $\mathcal{R}$ with $A \subseteq A'$.

Proof. Let $(A, q)$ be a critical rule for $\mathcal{A} = \mathcal{A}(\mathcal{R})$. Let $\tau$ denote the closure operator of the
dual $\mathcal{A}^*$ of $\mathcal{A}$. Let $X := Q \setminus \tau(A + q)$. By definition, $X$ is a member of $\mathcal{A}$ but $X + q$ is not.
Then $X + q$ is also not a member of $\mathcal{K}(\mathcal{R})$. There necessarily exists some rule $(A', q) \in \mathcal{R}$
such that $A' \cap (X + q) = \emptyset$. Take $s \in A$ arbitrarily. Since $(A, q)$ is a critical rule, by definition
we have $\tau(A + q) - q - s \in \mathcal{A}^*$ and $X + q + s \in \mathcal{A}$. Hence $A' \cap (X + q + s) \neq \emptyset$. It follows
that $s \in A'$. Thus we have $A \subseteq A'$. □


In particular, the set of critical rules is the unique minimal expression of an antimatroid
(among all possible Horn representations). The following corollary sharpens Theorem 2.6
and coincides with it if $\mathcal{R}$ corresponds to a set of circuits of $\mathcal{A}(\mathcal{R})$.

Corollary 3.11. Let $\mathcal{R}$ be a set of rules, and let $\mathcal{R}^*$ be the set of critical rules of $\mathcal{A}(\mathcal{R})$. Then
it holds that $\mathcal{A}(\mathcal{R}^*) = \mathcal{A}(\mathcal{R})$, $|\mathcal{R}^*| \le |\mathcal{R}|$, and $l(\mathcal{R}^*) \le l(\mathcal{R})$.

In particular, the following inclusion holds:

$\mathcal{A}(\mathcal{R}^*) = \mathcal{A}(\mathcal{R}) \subseteq \mathcal{K}(\mathcal{R}) \subseteq \mathcal{K}(\mathcal{R}^*)$.

Notice again that the inclusions are strict in general, since the set $\mathcal{R}^*$ of critical circuits
(rules) is not necessarily an implicational basis of $\mathcal{A}(\mathcal{R})$ or $\mathcal{K}(\mathcal{R})$.

Now we present a quadratic time algorithm to obtain this unique minimal expression.

Algorithm 3.12 (to compute critical rules).
Input: A set $\mathcal{R}$ of rules.
Output: All critical rules.
1: for all $(A, q) \in \mathcal{R}$ do
2:   for all $a \in A$ do
3:     if $(A - a, q)$ is an implicate of $\mathcal{A}(\mathcal{R})$ then
4:       $\mathcal{R} \leftarrow \mathcal{R} - (A, q) + (A - a, q)$
5:       $A \leftarrow A - a$
6:     end if
7:   end for
8:   if $(A, q)$ is an implicate of $\mathcal{A}(\mathcal{R} - (A, q))$ then
9:     $\mathcal{R} \leftarrow \mathcal{R} - (A, q)$
10:   end if
11: end for
12: return $\mathcal{R}$


To implement "for all" in line 1 (resp. line 2) in a usual programming language, we index $\mathcal{R}$
as $\mathcal{R} = \{(A_i, q_i)\}_{i=1}^{m}$ (resp. $A$ as $A = \{a_j\}_{j=1}^{k}$) and use a for-loop from $i = 1$ to $m$ (resp. $j = 1$
to $k$). We call the inference algorithm (Algorithm 3.6) in lines 3 and 8.
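
A direct transcription of Algorithm 3.12 along these lines might look as follows. This is a sketch under assumptions: `max_member` is a simple greedy stand-in for Algorithm 3.3 (so the overall running time is not the quadratic bound of the paper), removed rules are marked with `None` instead of being deleted from an indexed list, and the function names and the ground-set argument `Q` are introduced here for illustration.

```python
def max_member(rules, X):
    """X°: greedy fixpoint (simple stand-in for Algorithm 3.3)."""
    X, S = set(X), set()
    grew = True
    while grew:
        grew = False
        for x in list(X - S):
            if all(q != x or x in A or S & set(A) for A, q in rules):
                S.add(x)
                grew = True
    return S

def is_implicate(rules, A, q, Q):
    """Algorithm 3.6: (A, q) is an implicate iff q is not in (Q minus A)°."""
    return q not in max_member(rules, set(Q) - set(A))

def critical_rules(rules, Q):
    """Algorithm 3.12 on a list of (A, q) rules over the ground set Q."""
    R = [(frozenset(A), q) for A, q in rules]
    for i in range(len(R)):
        A, q = R[i]
        for a in sorted(A):                       # lines 2-7: shrink the premise
            current = [r for r in R if r is not None]
            if is_implicate(current, A - {a}, q, Q):
                A = A - {a}
                R[i] = (A, q)
        others = [r for j, r in enumerate(R) if j != i and r is not None]
        if is_implicate(others, A, q, Q):         # lines 8-10: drop a redundant rule
            R[i] = None
    return {r for r in R if r is not None}
```

For instance, adding the implicate $(\{2,3\},0)$ to the two rules $(\{0,2\},1)$ and $(\{1,3\},0)$ does not change the antimatroid, and the sketch removes it again.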

Theorem 3.13. Given a set $\mathcal{R}$ of rules, Algorithm 3.12 computes all critical rules of $\mathcal{A}(\mathcal{R})$
in time $O(l^2)$.


Proof. The total time of the algorithm is $O(l^2)$ since the algorithm calls the inference algorithm
$l$ times. We show that the output is actually the set of critical rules. Notice that $\mathcal{A}(\mathcal{R})$
does not change in each step. Indeed, if $(A - a, q)$ is an implicate of $\mathcal{A}(\mathcal{R})$, then $\mathcal{A}(\mathcal{R}) \subseteq
\mathcal{K}(\mathcal{R} - (A, q) + (A - a, q)) \subseteq \mathcal{K}(\mathcal{R})$ holds, implying $\mathcal{A}(\mathcal{R}) = \mathcal{A}(\mathcal{R} - (A, q) + (A - a, q))$
by maximality (Theorem 1.1). Similarly, if $(A, q)$ is an implicate of $\mathcal{A}(\mathcal{R} - (A, q))$, then
$\mathcal{A}(\mathcal{R}) \subseteq \mathcal{A}(\mathcal{R} - (A, q)) = \mathcal{A}(\mathcal{R} - (A, q)) \cap \mathcal{K}(\{(A, q)\}) \subseteq \mathcal{K}(\mathcal{R})$, implying $\mathcal{A}(\mathcal{R}) = \mathcal{A}(\mathcal{R} - (A, q))$.

In lines 3 to 6, the rule $(A, q)$ becomes prime. Indeed, suppose to the contrary that
$(A - a, q)$ is an implicate of $\mathcal{A}(\mathcal{R})$ for some $a \in A$. Then $(A' - a, q)$ is not an implicate for
$A \subseteq A'$, where $A'$ is equal to $A$ at the step in which $a$ was chosen. However, by $A - a \subseteq A' - a$ and
(E2), the rule $(A - a, q)$ cannot be an implicate, a contradiction. Therefore, the output $\mathcal{R}$
consists of prime implicates. By lines 8 and 9, the set $\mathcal{R}$ becomes minimal. By Theorem 2.6
this must be equal to the set of critical rules. □


By using Algorithm 3.12, we can efficiently check whether two given sets of rules define
the same antimatroid. Indeed, make both sets of rules critical, and compare them; they are
equal if and only if they define the same antimatroid.

Corollary 3.14. Given two sets $\mathcal{R}$ and $\mathcal{R}'$ of rules, we can determine whether $\mathcal{A}(\mathcal{R}) = \mathcal{A}(\mathcal{R}')$
in time $O(l^2)$ with $l = \max\{l(\mathcal{R}), l(\mathcal{R}')\}$.


3.4 Generating all nontrivial implicates of $\mathcal{A}(\mathcal{R})$

It is a natural problem to construct a superset $\mathcal{R}'$ of given $\mathcal{R}$ that satisfies $\mathcal{K}(\mathcal{R}') = \mathcal{A}(\mathcal{R})$.
We do not know a polynomial time algorithm to construct such a set $\mathcal{R}'$; see Section 5 for
further discussion. We here provide a simple algorithm for the related problem of generating
all implicates of $\mathcal{A}(\mathcal{R})$ from $\mathcal{R}$. As was seen in Theorem 2.2, the set $\mathcal{E}$ of all implicates of
$\mathcal{A}(\mathcal{R})$ obviously satisfies $\mathcal{K}(\mathcal{E}) = \mathcal{A}(\mathcal{R})$. Our algorithm sharpens Dietrich's construction [12,
Proposition 8] of all circuits from critical circuits, and may be comparable with the resolution
principle (or the consensus procedure) in Boolean function theory, where the resolution is used
to generate (prime) implicates of a Horn formula (or $\mathcal{K}(\mathcal{R})$); see [11, Chapter 6].

Our algorithm is based on the following variation of Proposition 2.4.

Lemma 3.15. For a set $\mathcal{R}$ of rules, it holds $\mathcal{K}(\mathcal{R}) = \mathcal{A}(\mathcal{R})$ if and only if for every pair of
nontrivial implicates $(A, q)$, $(A', q')$ of $\mathcal{K}(\mathcal{R})$, both $((A \cup A') - q - q', q)$ and $((A \cup A') - q - q', q')$
are implicates of $\mathcal{K}(\mathcal{R})$.

Proof. The condition $\mathcal{K}(\mathcal{R}) = \mathcal{A}(\mathcal{R})$ is equivalent to the condition that $\mathcal{K}(\mathcal{R})$ itself is an antimatroid. By
Proposition 2.4 it suffices to show that the condition of the statement is equivalent to (AE').

(Only-if part). Here $((A \cup A') - q', q)$ and $((A \cup A') - q, q')$ are also implicates (by (E1),
(E2)). By (AE') with $X = (A \cup A') - q - q'$, $z = q$, and $y = q'$, both $((A \cup A') - q - q', q)$ and
$((A \cup A') - q - q', q')$ are implicates.

(If part). Notice that (AE') trivially holds if $z \in X$ or $y \in X$. Nontrivial cases of (AE')
follow from letting $A = X + y$, $A' = X + z$, $y = q'$, and $z = q$. □

For two rules $(A, q)$ and $(A', q')$, define a rule $R(A, q; A', q')$ by

$R(A, q; A', q') := ((A \cup A') - q - q', q)$.

Consider the following simple procedure:

Algorithm 3.16 (to generate all nontrivial implicates of $\mathcal{A}(\mathcal{R})$).
Input: A set $\mathcal{R}_0$ of (nontrivial) rules.
Output: All nontrivial implicates.
1: $\mathcal{R} \leftarrow \mathcal{R}_0$
2: if there exist nontrivial rules $(A, q), (A', q') \in \mathcal{R}$ such that $R(A, q; A', q')$ or $R(A', q'; A, q)$
   does not belong to $\mathcal{R}$ then
3:   $\mathcal{R} \leftarrow \mathcal{R} + R(A, q; A', q') + R(A', q'; A, q)$,
4:   and go to line 2.
5: end if
6: return $\mathcal{R}$.

By analogy with resolution, the procedure in line 3 is called an antimatroidal
resolution. The following is a sharpening of [12, Proposition 8].

Theorem 3.17. Algorithm 3.16 computes all nontrivial implicates of $\mathcal{A}(\mathcal{R}_0)$. In particular,
it holds $\mathcal{K}(\mathcal{R}) = \mathcal{A}(\mathcal{R}_0)$.


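Proof. By Lemma 3.15, the output $\mathcal{R}$ $(\supseteq \mathcal{R}_0)$ consists of implicates of $\mathcal{A}(\mathcal{R}_0)$, and it holds
$\mathcal{A}(\mathcal{R}) = \mathcal{A}(\mathcal{R}_0)$. Let $\mathcal{O}$ be the set of all trivial rules (implicates). Obviously $\mathcal{K}(\mathcal{R}) = \mathcal{K}(\mathcal{R} \cup \mathcal{O})$
and $\mathcal{A}(\mathcal{R}) = \mathcal{A}(\mathcal{R} \cup \mathcal{O})$. We claim that $\mathcal{R} \cup \mathcal{O}$ is an entailment. Suppose that this is true.
Then the entailment $\mathcal{R} \cup \mathcal{O}$ is the set of all implicates of $\mathcal{K}(\mathcal{R} \cup \mathcal{O}) = \mathcal{K}(\mathcal{R})$ by Theorem 2.2
and satisfies the condition of Lemma 3.15 since $\mathcal{R}$ is closed under antimatroidal resolutions.
Hence $\mathcal{K}(\mathcal{R} \cup \mathcal{O}) = \mathcal{A}(\mathcal{R} \cup \mathcal{O}) = \mathcal{A}(\mathcal{R}) = \mathcal{A}(\mathcal{R}_0)$, and $\mathcal{R}$ is the set of all nontrivial implicates
of $\mathcal{A}(\mathcal{R}_0)$.

Thus it suffices to show that $\mathcal{R} \cup \mathcal{O}$ satisfies (E2) (since (E1) follows from (E2) and
trivial implicates). Let $(A, b_1), (A, b_2), \ldots, (A, b_k)$ and $(B, q)$ be rules in $\mathcal{R} \cup \mathcal{O}$ with $B =
\{b_1, b_2, \ldots, b_k\}$. We show that $(A, q) \in \mathcal{R} \cup \mathcal{O}$. We may assume that $q \notin A \cup B$ (otherwise
$(A, q) \in \mathcal{R} \cup \mathcal{O}$). Suppose that $(A, b_1), \ldots, (A, b_{k'}) \in \mathcal{R}$ and $(A, b_{k'+1}), \ldots, (A, b_k) \in \mathcal{O}$ for
some $k' \in \{1, 2, \ldots, k\}$. We show by induction that for $i = 1, \ldots, k'$, one can deduce $(A \cup
(B \setminus \{b_1, \ldots, b_i\}), q)$ by antimatroidal resolutions, i.e., it holds $(A \cup (B \setminus \{b_1, \ldots, b_i\}), q) \in \mathcal{R}$.
The case $i = k'$ is the desired claim (since $A \cup (B \setminus \{b_1, \ldots, b_{k'}\}) = A$). First, the nontrivial rule
$(A \cup (B \setminus \{b_1\}), q)$ is deduced from the nontrivial rules $(A, b_1)$ and $(B, q)$. Next, assume that $(A \cup
(B \setminus \{b_1, \ldots, b_i\}), q)$ was deduced for some $i$ with $1 \le i \le k' - 1$ by antimatroidal resolutions.
Then $(A \cup (B \setminus \{b_1, \ldots, b_{i+1}\}), q)$ is deduced from $(A, b_{i+1})$ and $(A \cup (B \setminus \{b_1, \ldots, b_i\}), q)$. Thus
$(A, q) \in \mathcal{R} \cup \mathcal{O}$ as required. □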
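
The closure procedure of Algorithm 3.16 is short enough to state as code. The following is a minimal sketch, assuming rules are given as `(A, q)` pairs with hashable elements; the function names `resolve` and `resolution_closure` are chosen here for illustration, and no attempt is made to control the (possibly exponential) size of the output.

```python
def resolve(r1, r2):
    """Antimatroidal resolution: return R(A,q;A',q') and R(A',q';A,q)."""
    (A, q), (B, p) = r1, r2
    base = (A | B) - {q, p}
    return (base, q), (base, p)

def resolution_closure(rules):
    """Close a set of nontrivial rules under antimatroidal resolution."""
    R = {(frozenset(A), q) for A, q in rules if q not in A}
    changed = True
    while changed:
        changed = False
        for r1 in list(R):
            for r2 in list(R):
                for new in resolve(r1, r2):
                    if new not in R:      # a new rule was deduced
                        R.add(new)
                        changed = True
    return R
```

On the two rules $(\{0,2\},1)$ and $(\{1,3\},0)$, a single resolution step yields $(\{2,3\},1)$ and $(\{2,3\},0)$, and the closure contains six rules in total.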


4 Application to educational systems 


In this section, we mention possible applications of our results to the design of computer-aided
educational systems. As mentioned in the introduction, an antimatroid is used as a
mathematical model of the space of knowledge states of learners, and is called a learning
space in the literature of Knowledge Space Theory (KST). The ground set $Q$ is a set of
questions in a certain domain, and the knowledge state of a learner is associated with a subset
$X \subseteq Q$ which he/she answers correctly. The collection of all possible knowledge states forms
a family $\mathcal{L}$ of subsets of $Q$. A KST-based educational system gives questions to a learner,
estimates his/her knowledge state $X \in \mathcal{L}$ according to the answers, and poses questions for
the subjects that he/she can acquire next. If the state of the learner reaches $Q$, then it might
be said that the learner has mastered all subjects in the domain. The hypothesis that $\mathcal{L}$ is an
antimatroid (a learning space) is reasonable as well as useful in the above learning process.
Indeed, for the state $X$ of a learner, the outer fringe $X^{O}$ (defined in (3.1) in Section 3.2) is
always nonempty, provided $\mathcal{L}$ is an antimatroid. Therefore the system naturally chooses a
question $q$ from $X^{O}$ and poses $q$ to the learner.

To realize the above learning process, the educational system needs to know, in advance,
the space $\mathcal{L}$ of knowledge states. The space $\mathcal{L}$ is constructed with the help of a human expert
(teacher) of the domain of the questions. It is practically impossible to ask the expert
whether $X$ is a state in $\mathcal{L}$ for all subsets $X \subseteq Q$, because this needs a huge number $2^{|Q|}$
of queries. Koppen [26], Koppen and Doignon [27], and Dowling-Muller [16], [32] introduced
an alternative procedure to construct the space by querying rules $(A_1, q_1), (A_2, q_2), \ldots$ to the expert,
where the query $(A, q)$ means:

($Q_{A,q}$) "Does a learner fail the question $q$, provided he/she fails every question in $A$?"

Then the space of knowledge states is estimated as $\mathcal{K}(\mathcal{P})$ for the set $\mathcal{P}$ of queries to which
the expert said "yes". The point is that it is not necessary to ask all queries, thanks to
the inference rules (E1) and (E2). Let $\mathcal{P}$ and $\mathcal{N}$ be the sets of the queries to which the expert
said "yes" and "no", respectively. If $(A, q)$ is an implicate of $\mathcal{K}(\mathcal{P})$, then the query $(A, q)$ is
automatically determined to be 'yes' at this moment. Such a query $(A, q)$ is called a positive
inference of $\mathcal{P}$. Let $\mathcal{P}^*$ denote the set of all positive inferences of $\mathcal{P}$. Similarly, there are
queries automatically determined to be 'no' by the following rules obtained from (E1) and (E2):


(NI-1) If $(A, p)$ is 'yes', $(A, q)$ is 'no', and $(A + p, b)$ is 'yes' for all $b \in B$, then $(B, q)$ is 'no'.

(NI-2) If $(A, p)$ is 'yes', $(B, p)$ is 'no', and $(B + q, a)$ is 'yes' for all $a \in A$, then $(B, q)$ is 'no'.

(NI-3) If $(A, p)$ is 'no', $(B + q, p)$ is 'yes', and $(A, b)$ is 'yes' for all $b \in B$, then $(B, q)$ is 'no'.

A negative inference of $\mathcal{P}$ and $\mathcal{N}$ is a 'no' query $(B, q)$ obtained by repeated applications
of (NI-1), (NI-2) and (NI-3). Let $\mathcal{N}^*$ denote the set of all negative inferences of $\mathcal{P}$ and $\mathcal{N}$.
Dowling [16] suggests a useful characterization of negative inferences:

(NF) $(A, q) \in \mathcal{N}^*$ if and only if some query in $\mathcal{N}$ is an implicate of $\mathcal{K}(\mathcal{P} + (A, q))$.
Thus there are three types of queries in each step: positive inferences, negative inferences, and
other queries that are called undetermined. Notice again that positive inferences and negative
inferences need not be asked.

The QUERY algorithm updates positive and negative inferences by using the inference
rules, once the expert returns the answer. The algorithm next chooses an undetermined
query and gives it to the expert. Koppen [26] suggested the following selection rule of queries. In
the first stage, queries are of form $(\{p\}, q)$. In the second stage, queries are of form $(\{p, p'\}, q)$.
In the $i$-th stage, queries are of form $(A, q)$ with $|A| = i$. He also suggested a stopping criterion
that after the $i$-th stage there is no undetermined query $(A, q)$ with $|A| = i + 1$. Positive and
negative inferences are collected in a table, and are used for the selection of queries asked
to the expert. There are several selection rules of queries [22, Section 15.2.9]. The resulting
space $\mathcal{K}(\mathcal{P})$ of knowledge states is constructed from the table; see [22, Section 15.2].

Dowling [16] developed a sophisticated version of the QUERY algorithm. Instead of keeping
all positive and negative inferences, her algorithm keeps the base of the current $\mathcal{K}(\mathcal{P})$ and the
set $m(\mathcal{N}^*)$ of maximal negative inferences. A maximal negative inference is a negative inference
$(A, q)$ with the property that $(A', q)$ is not a negative inference for every $A' \supsetneq A$. Then a
query $(B, q)$ is a negative inference if and only if $B \subseteq A$ for some maximal negative inference
$(A, q)$. Thus the set $\mathcal{N}^*$ of negative inferences is manageable by the set $m(\mathcal{N}^*)$ of maximal
negative inferences. Recall that the base $\mathcal{B}$ of $\mathcal{K}(\mathcal{P})$ is the set of members that cannot be the
union of other members of $\mathcal{K}(\mathcal{P})$. The inference problem (i.e., checking whether $(A, q) \in \mathcal{P}^*$)
can be easily solved by the base; see sni Proposition 3.2]. Dowling gave explicit formulas for
updating $\mathcal{B}$ and $m(\mathcal{N}^*)$ once the expert returns an answer [16, Theorems 4.1, 4.2, 4.3]. All
states of the resulting space $\mathcal{K}(\mathcal{P})$ are efficiently generated from $\mathcal{B}$; see [17].

Revised QUERY algorithm. The QUERY algorithm was designed for the case where
the target space is assumed to be a union-closed family (a knowledge space). Therefore the
output $\mathcal{K}(\mathcal{P})$ is not necessarily an antimatroid. We present a simple revision of the QUERY
algorithm to output an antimatroid. Our revision is obtained by replacing $\mathcal{K}(\mathcal{P})$ with $\mathcal{A}(\mathcal{P})$,
and is understood as a concrete realization of Doignon's adjusted QUERY algorithm m-
As above, let $\mathcal{P}$ and $\mathcal{N}$ denote the sets of queries to which the expert said "yes" and "no",
respectively. We can naturally define positive/negative inferences for the revised QUERY. A
strong positive inference of $\mathcal{P}$ is an implicate of $\mathcal{A}(\mathcal{P})$, and a strong negative inference of $\mathcal{P}$
and $\mathcal{N}$ is a query $(A, q)$ such that some query in $\mathcal{N}$ is an implicate of $\mathcal{A}(\mathcal{P} + (A, q))$. Let
$\mathcal{P}^{**}$ and $\mathcal{N}^{**}$ denote the sets of all strong positive and negative inferences, respectively. We
remark

$\mathcal{P}^* \subseteq \mathcal{P}^{**}$, $\mathcal{N}^* \subseteq \mathcal{N}^{**}$

by $\mathcal{A}(\mathcal{P}) \subseteq \mathcal{K}(\mathcal{P})$ and $\mathcal{A}(\mathcal{P} + (A, q)) \subseteq \mathcal{K}(\mathcal{P} + (A, q))$. Strong positive or negative inferences
need not be asked, since the estimated space is now $\mathcal{A}(\mathcal{P})$. The revision of QUERY
is as follows.

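Revised QUERY algorithm
0: Let $\mathcal{P} = \mathcal{P}^{**} = \mathcal{N} = \mathcal{N}^{**} := \emptyset$.
1: Choose a query $(A, q) \notin \mathcal{P}^{**} \cup \mathcal{N}^{**}$ (with smallest $|A|$), and ask the question ($Q_{A,q}$) to the
   expert.
2: If the answer is "yes", then add $(A, q)$ to $\mathcal{P}$ (and make $\mathcal{P}$ critical). If the answer is "no",
   then add $(A, q)$ to $\mathcal{N}$.
3: Update $\mathcal{P}^{**}$ and $\mathcal{N}^{**}$.
4: If a stopping criterion is fulfilled, then output $\mathcal{P}$ (or $\mathcal{A}(\mathcal{P})$); stop. Otherwise go to step 1.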
Let us look at the details of this algorithm. In step 3, we can use Algorithm 3.16 to generate all strong
positive inferences, i.e., all implicates of $\mathcal{A}(\mathcal{P})$. In the practical situation where the storage for
queries is limited (or $Q$ is large), we can use the inference algorithm (Algorithm 3.6) to generate
strong positive inferences $(A, q)$ with a specified size of $|A|$. Similarly, we can generate strong
negative inferences by applying the inference algorithm to queries in $\mathcal{N}$ for $\mathcal{A}(\mathcal{P} + (A, q))$.
Koppen's termination criterion and other selection rules of queries are implementable. By
the use of Algorithm 3.12, we may keep $\mathcal{P}$ compact. All states in $\mathcal{A}(\mathcal{P})$ can be efficiently
generated by Algorithm 3.8.


In the case where the output is required to be a proper antimatroid, we may modify step
2 as: If the answer is "yes" and $Q \in \mathcal{A}(\mathcal{P} + (A, q))$, then add $(A, q)$ to $\mathcal{P}$. Actually Doignon's
adjusted QUERY adopts this rule. The condition $Q \in \mathcal{A}(\mathcal{P} + (A, q))$ can be checked by the
membership algorithm (Algorithm 3.5).

In the idealistic case where the answers of the expert correctly follow his/her latent
antimatroid $\mathcal{L}$, by Theorem 1.1 it always holds

$\mathcal{L} \subseteq \mathcal{A}(\mathcal{P}) \subseteq \mathcal{K}(\mathcal{P})$.

Therefore the resulting $\mathcal{A}(\mathcal{P})$ might be a reasonable outer approximation of the true antimatroid
$\mathcal{L}$. Also sufficiently many queries uncover $\mathcal{L}$ as $\mathcal{L} = \mathcal{A}(\mathcal{P})$; see [HI Proposition 13].


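Remark 4.1. J.-P. Doignon asked what happens if the revised QUERY is applied to an expert
following a union-closed family $\mathcal{L}$. Let $\tilde{\mathcal{L}}$ denote the maximal antimatroid contained in $\mathcal{L}$.
Also in this case, $\mathcal{L} \subseteq \mathcal{K}(\mathcal{P})$ holds throughout the iterations.
Therefore, by maximality (Theorem 1.1), $\tilde{\mathcal{L}} \subseteq \mathcal{A}(\mathcal{P})$ holds. We could not guarantee that the
revised QUERY reaches $\tilde{\mathcal{L}} = \mathcal{A}(\mathcal{P})$. The reason is the existence of a query in $\mathcal{N}$ that is 'no' for
$\mathcal{L}$ but is 'yes' for $\tilde{\mathcal{L}}$. Consequently, even if a query $(A, q)$ satisfies $\tilde{\mathcal{L}} \subseteq \mathcal{A}(\mathcal{P} + (A, q)) \subset \mathcal{A}(\mathcal{P})$,
the query $(A, q)$ may fall into $\mathcal{N}^{**}$ and is never posed.

This is not the case if a query is chosen outside $\mathcal{P}^{**} \cup \mathcal{N}^*$ in each iteration. In fact, if
$\mathcal{A}(\mathcal{P}) \setminus \tilde{\mathcal{L}} \neq \emptyset$, there is $(A, q) \notin \mathcal{P}^{**} \cup \mathcal{N}^*$ such that $(A, q)$ is an implicate of $\mathcal{L}$ and $\tilde{\mathcal{L}} \subseteq
\mathcal{A}(\mathcal{P} + (A, q)) \subset \mathcal{A}(\mathcal{P})$; such a query is posed to decrease $\mathcal{A}(\mathcal{P})$. To see this, consider a minimal
$X \in \mathcal{A}(\mathcal{P}) \setminus \tilde{\mathcal{L}}$, and consider $X^\circ$ with respect to $\tilde{\mathcal{L}}$. By the minimality and Proposition 3.1 it
must hold $|X \setminus X^\circ| = 1$. Therefore $X$ is not in $\mathcal{L}$, and there is an implicate $(A, q)$ of $\mathcal{L}$ (and of
$\tilde{\mathcal{L}}$) not accepting $X$ $(\in \mathcal{A}(\mathcal{P}))$. Obviously $(A, q) \notin \mathcal{P}^{**}$. Also $(A, q) \notin \mathcal{N}^*$ by $\mathcal{L} \subseteq \mathcal{K}(\mathcal{P} + (A, q))$.
Thus $(A, q)$ is a desired query.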


Remark 4.2. Our revision can incorporate Dowling's update of maximal negative inferences.
Her formulas [16, Theorems 4.2, 4.3] only involve the closure operator of (the dual of) $\mathcal{K}(\mathcal{P})$. Thus
the desired update formulas are obtained simply by changing the closure operator of $\mathcal{K}(\mathcal{P})$
to the closure operator of $\mathcal{A}(\mathcal{P})$. By Theorem 3.4, the closure operator of $\mathcal{A}(\mathcal{P})$ is efficiently
computable. On the other hand, the base update [16, Theorem 4.1] does not seem to adapt
directly to $\mathcal{A}(\mathcal{P})$. This issue is left to future work.


Example 4.3. We give one small but instructive example. Let $Q = \{0, 1, 2, 3\}$ and let $\mathcal{R}$
consist of two rules $(\{0, 2\}, 1)$ and $(\{1, 3\}, 0)$. The goal is to identify $\mathcal{L} := \mathcal{A}(\mathcal{R})$ by QUERY
algorithms. Queries are examined, in the same order, for both original and revised QUERY
algorithms. The algorithms terminate if $\mathcal{L} = \mathcal{K}(\mathcal{P})$ for the original and $\mathcal{L} = \mathcal{A}(\mathcal{P})$ for the revised.
Table 1 describes the behavior of the two algorithms. The first column indicates queries, which
are examined from top to bottom, and the second and third columns indicate the actions of
the original and revised QUERY, respectively. Here posed:YES (resp. posed:NO) means that
the query in the same row is posed to the expert and the answer is "yes" (resp. "no"), and
negainf means that the query is a (strong) negative inference (in revised QUERY) and is not
posed. The first 16 (nontrivial) queries of form $(A, q)$ with $|A| \le 1$ are posed for both original
and revised, and the answers are all "no". In total, the original QUERY posed 25 queries to
the expert. The revised QUERY posed 22 queries, and finished two queries earlier than the
original QUERY, which is caused by $\mathcal{L} = \mathcal{A}(\mathcal{R}) \subset \mathcal{K}(\mathcal{R})$. Query $(\{1, 2\}, 0)$ is not a negative
inference but a strong negative inference, and hence is not posed in the revised QUERY.

Table 1: Behavior of original and revised QUERY algorithms

query                     original      revised
$(A, q)$: $|A| \le 1$     posed:NO      posed:NO
$(\{0,1\}, 2)$            posed:NO      posed:NO
$(\{0,1\}, 3)$            posed:NO      posed:NO
$(\{0,2\}, 1)$            posed:YES     posed:YES
$(\{0,2\}, 3)$            posed:NO      posed:NO
$(\{0,3\}, 1)$            posed:NO      posed:NO
$(\{0,3\}, 2)$            negainf       negainf
$(\{1,2\}, 0)$            posed:NO      negainf
$(\{1,2\}, 3)$            negainf       negainf
$(\{1,3\}, 0)$            posed:YES     posed:YES
$(\{1,3\}, 2)$            posed:NO
$(\{2,3\}, 0)$            posed:YES
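
The difference between a negative inference and a strong negative inference for the query $(\{1,2\},0)$ in Example 4.3 can be checked mechanically. The sketch below is illustrative only: it uses the characterization (NF) and its analogue with $\mathcal{A}$ in place of $\mathcal{K}$, simple quadratic routines in place of the linear time algorithms, and hypothetical function names; the sets `P` and `N` are the answers collected in Table 1 before $(\{1,2\},0)$ is examined.

```python
def max_member_K(rules, X):
    """Largest member of K(R) inside X: delete heads of violated rules until stable."""
    Y = set(X)
    changed = True
    while changed:
        changed = False
        for A, q in rules:
            if q in Y and not (Y & set(A)):
                Y.discard(q)
                changed = True
    return Y

def max_member_A(rules, X):
    """Largest member of A(R) inside X: greedy accessible construction."""
    X, S = set(X), set()
    grew = True
    while grew:
        grew = False
        for x in list(X - S):
            if all(q != x or x in A or S & set(A) for A, q in rules):
                S.add(x)
                grew = True
    return S

def negative_inference(P, N, query, Q, max_member):
    """Is some 'no' query an implicate once `query` is assumed to be 'yes'?"""
    extended = list(P) + [query]
    return any(q not in max_member(extended, set(Q) - set(A)) for A, q in N)

Q = {0, 1, 2, 3}
P = [({0, 2}, 1)]
N = ([(set(), q) for q in Q]
     + [({p}, q) for p in Q for q in Q if p != q]
     + [({0, 1}, 2), ({0, 1}, 3), ({0, 2}, 3), ({0, 3}, 1)])
```

With these data, the query $(\{1,2\},0)$ passes the test with `max_member_A` (because $(\{2\},0) \in \mathcal{N}$ becomes an implicate of $\mathcal{A}(\mathcal{P} + (\{1,2\},0))$) but not with `max_member_K`, in agreement with Table 1.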


Preliminary experimental results. We conducted preliminary computer experiments to
investigate how the revision contributes to the reduction of the number of queries asked. We
prepare, on a computer, a target antimatroid $\mathcal{L}$ and an idealistic expert who answers each query
$(A, q)$ correctly. Namely, the expert answers "yes" if $(A, q)$ is an implicate of $\mathcal{L}$, and "no"
otherwise. The goal is to identify $\mathcal{L}$ by queries. We compare two QUERY algorithms. The
first algorithm is (a simpler version of) the original QUERY algorithm. In each step, the
algorithm poses a query $(A, q) \notin \mathcal{P}^* \cup \mathcal{N}^*$, where $\mathcal{P}$ and $\mathcal{N}$ are the sets of 'yes' queries and
'no' queries, respectively, obtained so far. The algorithm terminates when $\mathcal{K}(\mathcal{P}) = \mathcal{L}$. The
second algorithm is a simpler version of our revised QUERY algorithm, which is obtained from
the first one by replacing $\mathcal{P}^* \cup \mathcal{N}^*$ with $\mathcal{P}^{**} \cup \mathcal{N}^{**}$ and replacing the termination criterion
$\mathcal{K}(\mathcal{P}) = \mathcal{L}$ with $\mathcal{A}(\mathcal{P}) = \mathcal{L}$. Thanks to the algorithms in the previous section, they are
efficiently implementable on a computer. We compare the numbers of queries posed to the
expert.

The experiment was done as follows. The ground set $Q$ consists of 10 elements. We applied
the above two algorithms to 200 instances of target antimatroids $\mathcal{L}$, and compared the numbers
of queries posed to identify $\mathcal{L}$, where the target antimatroid $\mathcal{L}$ is given by $\mathcal{L} := \mathcal{A}(\mathcal{R}_0)$ for a
set $\mathcal{R}_0$ of 10 randomly chosen rules. Both algorithms examine, in the same order, queries
$(A, q)$ from smaller $|A|$. We count the number of queries that are posed to the expert.

The result is summarized as follows. On average, the first algorithm posed 2128 queries
and the second algorithm posed 1525 queries. Table 2 shows the distribution of instances with
respect to the reduction of queries. Here the rate $r$ of cut of queries (from original to revised)
is defined as

$r := \frac{N_1 - N_2}{N_1} \times 100$,

where $N_1$ and $N_2$ denote the numbers of queries posed, respectively, by the original algorithm
and by the revised algorithm (for the same target). In particular, the revision achieves the
average cut rate 29%. This reduction of queries was mostly caused by negative inferences.
For queries $(A, q)$ with smaller $|A|$, the answers tend to be 'no'. Indeed, 99% of queries posed
are 'no' (i.e., posed:NO); the average number of queries to which the expert said "no" is 2107 for
the first algorithm and is 1505 for the second algorithm. Consequently, the query reduction
by (strong) negative inferences is more powerful than that by (strong) positive inferences. In
the most successful instance, the cut rate is 42%.

Table 2: Result

Rate $r$ of cut        Number of instances
$0 \le r < 10$         1
$10 \le r < 15$        2
$15 \le r < 20$        7
$20 \le r < 25$        37
$25 \le r < 30$        55
$30 \le r < 35$        70
$35 \le r < 40$        24
$40 \le r$             4

We also conducted the same experiment for the reverse ordering of queries, where queries
are examined from larger $|A|$. (This is a difficult situation for a human expert.) Also in this
case, our revision effectively reduces the number of queries posed. On average, the original
QUERY posed 434 queries and the revised QUERY posed 207 queries. Thus the cut rate
is 52% on average. Compared with the above ordering, the numbers of required queries are
considerably smaller for both the original and the revised. This may be caused by our random
construction of instances. Contrary to the above ordering of queries, posed queries tend to
be 'yes'; the average number of queries to which the expert said "yes" is 390 for the original and
is 163 for the revised. Consequently, strong positive inferences contribute to the reduction of
queries effectively.

These experimental results show that the revised QUERY has the potential to drastically
reduce the burden on human experts in antimatroid design for KST-based educational
systems.

5 Concluding remarks 

In this paper, we have studied the representation $\mathcal{R} \mapsto \mathcal{A}(\mathcal{R})$ of an antimatroid from algorithmic
and Boolean function theoretic points of view, and mentioned its potential applications
to actual educational system designs. There remain several algorithmic questions that are
interesting from both theoretical and practical sides. We end this paper with some open
problems and future research issues.

How to recognize whether $\mathcal{K}(\mathcal{R})$ is an antimatroid. We have mainly focused on $\mathcal{A}(\mathcal{R})$,
which is always an antimatroid. As we mentioned in the introduction, an antimatroid $\mathcal{A}$ always
admits a set $\mathcal{R}$ of rules with $\mathcal{A} = \mathcal{K}(\mathcal{R})$. Thus it is natural to give a characterization of a set
$\mathcal{R}$ of rules such that $\mathcal{K}(\mathcal{R})$ is an antimatroid, or equivalently, $\mathcal{K}(\mathcal{R}) = \mathcal{A}(\mathcal{R})$. Related to such
a characterization, it is quite natural to consider the following decision problem:

Input: A set $\mathcal{R}$ of rules.

Task: Decide whether $\mathcal{K}(\mathcal{R})$ is an antimatroid.

We do not know whether this problem is in NP, though it is not difficult to show that it is in
co-NP.

Proposition 5.1. The problem of deciding whether $\mathcal{K}(\mathcal{R})$ is an antimatroid is in co-NP.
Proof. Suppose that $\mathcal{K}(\mathcal{R})$ is not an antimatroid, i.e., $\mathcal{K}(\mathcal{R}) \neq \mathcal{A}(\mathcal{R})$. There is $X \in \mathcal{K}(\mathcal{R}) \setminus
\mathcal{A}(\mathcal{R})$. Therefore we can check that $X$ is in $\mathcal{K}(\mathcal{R}) \setminus \mathcal{A}(\mathcal{R})$ by linear time membership algorithms
for $\mathcal{K}(\mathcal{R})$ and for $\mathcal{A}(\mathcal{R})$. This gives a polynomial certificate for a NO instance. □
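
Verifying such a certificate takes only a few lines. The following sketch assumes rules are `(A, q)` pairs and uses simple (not linear time) membership tests; the function names are hypothetical.

```python
def in_K(rules, X):
    """X is in K(R) iff every rule (A, q) with A disjoint from X has q outside X."""
    X = set(X)
    return all(q not in X or bool(X & set(A)) for A, q in rules)

def in_A(rules, X):
    """X is in A(R) iff the greedy accessible construction inside X reaches X."""
    X, S = set(X), set()
    grew = True
    while grew:
        grew = False
        for x in list(X - S):
            if all(q != x or x in A or S & set(A) for A, q in rules):
                S.add(x)
                grew = True
    return S == X

def is_no_certificate(rules, X):
    """True iff X witnesses that K(R) differs from A(R)."""
    return in_K(rules, X) and not in_A(rules, X)
```

For the rules $(\{0,2\},1)$ and $(\{1,3\},0)$, the set $\{0,1\}$ is such a certificate: it satisfies both rules but cannot be reached from $\emptyset$ by adding one element at a time.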


We have seen in Corollary 2.8 a sufficient condition for $\mathcal{K}(\mathcal{R}) = \mathcal{A}(\mathcal{R})$. Namely, if $\mathcal{R}$
corresponds to the set of circuits of an antimatroid, then $\mathcal{K}(\mathcal{R}) = \mathcal{A}(\mathcal{R})$ holds. By Dietrich's
characterization of an antimatroid by circuits [12] (see [30, Theorem 3.9]) we can determine
in polynomial time whether $\mathcal{R}$ corresponds to the set of circuits of an antimatroid. Also if
all nontrivial (prime) implicates of $\mathcal{A}(\mathcal{R})$ are given (e.g., by Algorithm 3.16), then we can check
whether $\mathcal{K}(\mathcal{R}) = \mathcal{A}(\mathcal{R})$. But this approach never gives a polynomial time algorithm, since the
number of all nontrivial (prime) implicates may be exponential in the input size $l(\mathcal{R})$. Another
approach is to find the base $\mathcal{B}$ of $\mathcal{K}(\mathcal{R})$, and to use an algorithm of Eppstein, Falmagne,
and Uzun [20] for checking whether $\mathcal{B}$ is the base of an antimatroid. However, $|\mathcal{B}|$ may be
exponential in $l(\mathcal{R})$. See [11, Section 3.6] for computational issues on prime implicants, bases,
and rules (implications).

Adaricheva and Nation [3] introduced the notion of a closure system with unique criticals
(UC-systems), where a convex geometry (antimatroid) is a particular example of a UC-system;
see also [1]. They showed a polynomial time algorithm to decide whether a given set of rules
defines a UC-system [2, Proposition 45]. This algorithm checks a necessary condition for $\mathcal{K}(\mathcal{R})$
to be an antimatroid.


Toward computational learning theory for antimatroids. Building an antimatroid by
querying an expert in Section 4 should also be discussed and analyzed from the viewpoint of
computational learning theory, particularly from Angluin's framework [5] of learning Boolean
functions by queries; see a survey [33]. The problem formulation is the following. The task
is to identify (learn) a family $\mathcal{L}$ of subsets (or a Boolean function) by a certain (logical)
expression, such as CNF. Here we are allowed to use a certain kind of oracle that returns
information on the target $\mathcal{L}$. Typical oracles are:

Membership oracle: The query is a subset $X$. The oracle returns "yes" if $X \in \mathcal{L}$, and "no"
otherwise.

Equivalence oracle: The query is a family $\mathcal{L}'$. The oracle returns "yes" if $\mathcal{L}' = \mathcal{L}$, and "no"
otherwise. If the answer is "no", then a subset $X \in \mathcal{L} \triangle \mathcal{L}'$ is also returned.

There are several results on query learning of Horn functions, or equivalently, union-closed
families. Angluin, Frazier, and Pitt [6] gave an algorithm to learn a Horn function (a union-closed
family $\mathcal{L}$) by $O(m^2 n)$ membership and $O(mn)$ equivalence queries, where $n$ is the
number of variables (the cardinality of the ground set $Q$) and $m$ is the number of clauses of
the Horn formula (the number of rules in a set $\mathcal{R}$ with $\mathcal{L} = \mathcal{K}(\mathcal{R})$). See also a recent related
work [7]. Frazier and Pitt [23] considered the entailment oracle for a Horn function, and gave
an algorithm to learn a Horn function by a polynomial number of entailment and equivalence
queries, where the entailment oracle is:

Entailment oracle: The query is a rule $(A, q)$. The oracle returns "yes" if $(A, q)$ is an
implicate of $\mathcal{L}$, and "no" otherwise.

The problem of building a space of knowledge states by querying an expert, considered in
Section 4, may be formulated mathematically as the problem of learning a Horn function by
the entailment oracle. In our setting, it is practically impossible to let a human expert play the
equivalence oracle, and it may be difficult to apply these results to actual educational system
designs. Nevertheless it is quite interesting to develop a practically feasible and theoretically
efficient learning algorithm for spaces of knowledge states, particularly antimatroids, from
the viewpoint of computational learning theory. It should be noted that the above learning
algorithms identify the target space $\mathcal{L}$ by $\mathcal{L} = \mathcal{K}(\mathcal{R})$. In the case where $\mathcal{L}$ is assumed to be an
antimatroid, it is natural to identify $\mathcal{L}$ by $\mathcal{L} = \mathcal{A}(\mathcal{R})$, as in our revised QUERY algorithm.
In this setting, an alternative learning algorithm with a better theoretical guarantee may be
possible.

Largest extension of an antimatroid. Adaricheva and Nation [2] showed (in dual form)
that for any antimatroid $\mathcal{A}$ on $Q$ there exists a unique maximal antimatroid
$\overline{\mathcal{A}}$ on $Q$ containing $\mathcal{A}$ as a sublattice, where
$\overline{\mathcal{A}}$ is called the largest extension of $\mathcal{A}$. This suggests a way of
associating a set $\mathcal{R}$ of rules with the largest extension
$\overline{\mathcal{A}}(\mathcal{R})$ of $\mathcal{A}(\mathcal{R})$. A natural question is: how
can we handle $\overline{\mathcal{A}}(\mathcal{R})$ efficiently via $\mathcal{R}$? A naive
membership algorithm for $\overline{\mathcal{A}}(\mathcal{R})$ obtained from the definition
[2, p.199 (E)] requires checking a condition for all members of $\mathcal{A}(\mathcal{R})$, and
is far from polynomial. An algorithmic theory for the association
$\mathcal{R} \mapsto \overline{\mathcal{A}}(\mathcal{R})$, as well as its application to KST, is
an interesting direction for future research.
A Appendix

A.1 Relation to Horn functions

We summarize the relation to Horn functions in Boolean function theory; see m Chapter 6] for
details. A Boolean function is a $\{0,1\}$-valued function $f$ defined on $\{0,1\}^n$, i.e.,
$f : \{0,1\}^n \to \{0,1\}$. A family $\mathcal{K}$ on $Q = \{1, 2, \ldots, n\}$ is identified
with the Boolean function $f_{\mathcal{K}}$ defined by $f_{\mathcal{K}}(x_1, x_2, \ldots, x_n) = 1$
if $\{i \mid x_i = 1\} \in \mathcal{K}$ and $0$ otherwise.

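A Boolean function $f$ is called Horn if it is represented as a Horn CNF:

\[
f(x_1, x_2, \ldots, x_n) = \bigwedge_{i=1}^{m} \Bigl( \bigvee_{j \in P_i} x_j \vee \bigvee_{j \in N_i} \overline{x}_j \Bigr),
\]

where $P_i$ and $N_i$ are disjoint subsets of $\{1, 2, \ldots, n\}$ with $|P_i| \le 1$ for
$i = 1, 2, \ldots, m$. Here $\overline{x}_i = 1 - x_i$, $x_i \vee x_j := \max\{x_i, x_j\}$, and
$x_i \wedge x_j := \min\{x_i, x_j\}$. For $A = \{i_1, i_2, \ldots, i_k\}$ and
$B = \{j_1, j_2, \ldots, j_l\}$, $\bigvee_{i \in A} x_i \vee \bigvee_{j \in B} \overline{x}_j$
denotes $x_{i_1} \vee x_{i_2} \vee \cdots \vee x_{i_k} \vee \overline{x}_{j_1} \vee \cdots \vee \overline{x}_{j_l}$.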

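Suppose that $Q = \{1, 2, \ldots, n\}$. For a Boolean function $f$, let $T(f)$ denote the set of
points $x \in \{0,1\}^n = 2^Q$ with $f(x) = 1$. For a rule $(A, q)$ with
$A = \{p_1, p_2, \ldots, p_k\}$, the Boolean function $f_{A,q}$ is defined by

\[
f_{A,q}(x_1, x_2, \ldots, x_n) = \overline{x}_{p_1} \vee \overline{x}_{p_2} \vee \cdots \vee \overline{x}_{p_k} \vee x_q.
\]

For a set $\mathcal{R}$ of rules, we obtain a Horn function $f_{\mathcal{R}}$ by

\[
f_{\mathcal{R}} := \bigwedge_{(A,q) \in \mathcal{R}} f_{A,q}.
\]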
Then the family $\mathcal{K}(\mathcal{R})$ and the Horn function $f_{\mathcal{R}}$ are equivalent
objects in the following sense. For $x \in \{0,1\}^n$, let $I_0(x)$ denote the set of elements
$i \in Q$ with $x_i = 0$ (the 0-support of $x$).

Lemma A.1. For a set $\mathcal{R}$ of rules, it holds that
$\mathcal{K}(\mathcal{R}) = \{ I_0(x) \mid x \in T(f_{\mathcal{R}}) \}$.

Notice that if we associate a rule $(A, q)$ ($A = \{p_1, p_2, \ldots, p_k\}$) with
$x_{p_1} \vee \cdots \vee x_{p_k} \vee \overline{x}_q$, then a set $\mathcal{R}$ of rules
corresponds to a dual Horn function $g_{\mathcal{R}}$, and $\mathcal{K}(\mathcal{R})$ corresponds
to the set of 1-supports of $T(g_{\mathcal{R}})$.
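
The correspondence in Lemma A.1 can be checked by brute force on a small instance. The sketch
below assumes that $\mathcal{K}(\mathcal{R})$ consists of the sets $X$ such that $q \in X$
implies $X \cap A \neq \emptyset$ for every rule $(A, q)$, which is the reading consistent with
the clause $f_{A,q}$. All function names and the example rules are hypothetical.

```python
from itertools import product


def f_rule(x, A, q):
    """Horn clause f_{A,q}: OR of the negated x_p (p in A) and x_q; x maps items to 0/1."""
    return any(x[p] == 0 for p in A) or x[q] == 1


def true_points(ground, rules):
    """T(f_R): all 0/1 assignments that satisfy every clause f_{A,q}."""
    points = []
    for bits in product((0, 1), repeat=len(ground)):
        x = dict(zip(ground, bits))
        if all(f_rule(x, A, q) for A, q in rules):
            points.append(x)
    return points


def zero_support(x):
    """I_0(x): the set of items whose coordinate is 0."""
    return frozenset(i for i, v in x.items() if v == 0)


def family_K(ground, rules):
    """K(R) computed directly: X is a member iff q in X implies X meets A."""
    members = set()
    for bits in product((0, 1), repeat=len(ground)):
        X = frozenset(i for i, b in zip(ground, bits) if b)
        if all(q not in X or X & frozenset(A) for A, q in rules):
            members.add(X)
    return members


# Hypothetical example: item 2 requires item 1, and item 3 requires item 2.
ground = [1, 2, 3]
rules = [({1}, 2), ({2}, 3)]
```

Comparing `{zero_support(x) for x in true_points(ground, rules)}` with `family_K(ground, rules)`
gives the same four sets for this example. The enumeration is exponential in $n$ and is meant
only as a sanity check.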

In particular, we can apply known results on Horn functions to $\mathcal{K}(\mathcal{R})$. For
example, deciding whether $(A, q)$ is an implicate of $\mathcal{K}(\mathcal{R})$ is equivalent to
deciding whether every $x \notin T(f_{A,q})$ satisfies $x \notin T(f_{\mathcal{R}})$. This is
solved as follows. Fix the variables $x_q = 0$ and $x_i = 1$ for $i \in A$. Substitute them into
$f_{\mathcal{R}}$ to obtain another Horn CNF $f'$. If $f'$ is satisfiable (i.e.,
$\exists x \in T(f')$), then the answer is NO. Otherwise the answer is YES. It is well-known that
the satisfiability problem for Horn CNF is efficiently solved; see m Section 6.4.1]. 
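
The test described above can be sketched in set terms. Fixing $x_i = 1$ for $i \in A$ means
looking only at members disjoint from $A$, and propagating forced values corresponds to deleting
elements whose rule is violated. The loop below is a naive fixed-point iteration, not a
linear-time Horn satisfiability routine, and it assumes the same reading of
$\mathcal{K}(\mathcal{R})$ as in Lemma A.1. The function names and the example are hypothetical.

```python
def largest_member_avoiding(ground, rules, A):
    """Largest member of K(R) disjoint from A, found by repeated element removal."""
    X = set(ground) - set(A)
    changed = True
    while changed:
        changed = False
        for B, r in rules:
            # Rule (B, r) is violated when r is in X but X misses B: drop r.
            if r in X and not (X & set(B)):
                X.discard(r)
                changed = True
    return frozenset(X)


def is_implicate(ground, rules, A, q):
    """(A, q) is an implicate of K(R) iff no member avoids A and contains q."""
    return q not in largest_member_avoiding(ground, rules, A)


# Hypothetical example: item 2 requires item 1, and item 3 requires item 2.
ground = [1, 2, 3]
rules = [({1}, 2), ({2}, 3)]
```

Because $\mathcal{K}(\mathcal{R})$ is union-closed, the set returned by
`largest_member_avoiding` contains every member disjoint from $A$, so checking whether it
contains $q$ suffices.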


A.2 Proof of Proposition 2.4

(1). For the set $\mathcal{C}$ of all circuits and the corresponding set $\mathcal{R}$ of rules,
it holds that $\mathcal{A} = C(\mathcal{C}) = \mathcal{A}(\mathcal{R}) \subseteq \mathcal{K}(\mathcal{R})$
by Theorems 2.5 and 2.6. This means that for any circuit $(C, r)$, the rule $(C - r, r)$ is
actually an implicate of $\mathcal{A}$. We next show that $(C - r, r)$ is prime. Let $a$ be an
arbitrary element of $C - r$. By definition, $C - a$ is free, and hence there exists a member
$K$ of $\mathcal{A}$ such that $K \cap (C - a) = \{r\}$. Then $K$ contains $r$ and satisfies
$K \cap (C - a - r) = \emptyset$. This means that $(C - a - r, r)$ is not an implicate of
$\mathcal{A}$. Thus $(C - r, r)$ is a prime implicate.

(2). Next, suppose that $(A, q)$ is a prime implicate. Namely, for any element $a$ of $A$, the
rule $(A - a, q)$ is not an implicate of $\mathcal{A}$. Notice that $A + q$ is not free, since
there is no member $K$ of $\mathcal{A}$ with $(A + q) \cap K = \{q\}$. Choose an arbitrary element
$x$ of $A + q$. We are going to show that $A + q - x$ is free; then it follows that $(A + q, q)$
is a circuit by definition. Here $\mathcal{A} \cap X$ denotes
$\{X \cap K \mid K \in \mathcal{A}\}$ for simplicity. We first consider the case where $x = q$.
Let $a \in A$. By assumption, $(A - a, q)$ is not an implicate of $\mathcal{A}$. Then there exists
a member $K$ of $\mathcal{A}$ such that $q \in K$ and $K \cap (A - a) = \emptyset$. But
$K \cap A \neq \emptyset$ since $(A, q)$ is an implicate of $\mathcal{A}$. Therefore,
$K \cap A = \{a\}$ and $\mathcal{A} \cap A$ contains $\{a\}$. Since $\mathcal{A}$ is closed under
union, so is $\mathcal{A} \cap A$. It follows that $A + q - q = A$ is free.

We next consider the case where $x \in A$. Let $a \in A - x$. By the same discussion as above,
there exists a member $K$ of $\mathcal{A}$ such that $q \in K$ and $K \cap A = \{a\}$. We take a
minimal such $K$. By the minimality of $K$ and the accessibility of $\mathcal{A}$, at least one of
$K - a$ and $K - q$ must be a member of $\mathcal{A}$. But $K - a$ is not a member of
$\mathcal{A}$, since $(A, q)$ is an implicate of $\mathcal{A}$ and we have $q \in K - a$ and
$(K - a) \cap A = \emptyset$. Hence $K - q$ is a member of $\mathcal{A}$. Since
$(K - q) \cap (A - x + q) = K \cap (A - x) = \{a\}$, the family $\mathcal{A} \cap (A - x + q)$
contains $\{a\}$. We show that $\mathcal{A} \cap (A - x + q)$ also contains $\{q\}$. Since
$(A, q)$ is an implicate of $\mathcal{A}$ but $(A - x, q)$ is not, there exists a member $L$ of
$\mathcal{A}$ such that $q \in L$ and $L \cap A = \{x\}$. We have $L \cap (A - x + q) = \{q\}$.
Therefore, $\mathcal{A} \cap (A - x + q)$ contains all singletons $\{a\}$ with $a \in A - x + q$.
Since $\mathcal{A} \cap (A - x + q)$ is closed under union, the set $A - x + q = A + q - x$ is
free. Thus we conclude that $(A + q, q)$ is a circuit.
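
The correspondence established by the two parts of the proof, between circuits $(C, r)$ and
prime implicates $(C - r, r)$, can be checked by brute force on a small antimatroid. The sketch
below assumes the definitions as the proof uses them: a set $X$ is free when the trace
$\mathcal{A} \cap X$ is the whole power set of $X$, and a circuit is a minimal non-free set $C$
whose root $r$ is the element with $\{r\}$ missing from the trace. The function names and the
example family are hypothetical.

```python
from itertools import combinations


def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def trace(family, X):
    """The trace {K & X : K in family}."""
    return {K & X for K in family}


def is_free(family, X):
    """X is free when its trace is the full power set of X."""
    return len(trace(family, X)) == 2 ** len(X)


def circuits(family, ground):
    """Rooted circuits (C, r): C is minimal non-free and {r} is absent from its trace."""
    result = set()
    for C in subsets(ground):
        if C and not is_free(family, C) and all(is_free(family, C - {a}) for a in C):
            for r in C:
                if frozenset({r}) not in trace(family, C):
                    result.add((C, r))
    return result


def is_implicate(family, A, q):
    """(A, q) is an implicate when every member containing q meets A."""
    return all(K & A for K in family if q in K)


def prime_implicates(family, ground):
    """Implicates (A, q), q not in A, such that no (A - a, q) is an implicate."""
    result = set()
    for q in ground:
        for A in subsets(set(ground) - {q}):
            if is_implicate(family, A, q) and not any(
                is_implicate(family, A - {a}, q) for a in A
            ):
                result.add((A, q))
    return result


# Hypothetical antimatroid on {1, 2, 3}: item 3 is reachable only after both 1 and 2.
ground = {1, 2, 3}
family = {frozenset(s) for s in [(), (1,), (2,), (1, 2), (1, 2, 3)]}
```

For this family the circuits are $(\{1,3\}, 3)$ and $(\{2,3\}, 3)$, and mapping each circuit
$(C, r)$ to $(C - r, r)$ reproduces the set of prime implicates. The enumeration is exponential
and serves only as a check on tiny examples.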